DOCUMENT RESUME 



ED 224 218 



EC 150 601 



AUTHOR          White, Karl R.; And Others
TITLE           A Meta-Analysis of Previous Research on the Treatment
                of Hyperactivity. Final Report.
INSTITUTION     Utah State Univ., Logan. Exceptional Child Center.
SPONS AGENCY    National Inst. of Education (ED), Washington, DC.
PUB DATE        [82]
GRANT           NIE-G-80-0008
NOTE            120p.
PUB TYPE        Reports - Research/Technical (143) -- Information
                Analyses (070)
EDRS PRICE      MF01/PC05 Plus Postage.
DESCRIPTORS     *Drug Therapy; Exceptional Child Research;
                *Hyperactivity; *Intervention; Literature Reviews;
                *Research Methodology
IDENTIFIERS     *Meta Analysis



ABSTRACT
The meta-analysis approach used for the project required locating all
various characteristics of studies that might have affected the
results, and using relational and descriptive statistical techniques 
to summarize study outcomes and examine the covariation of study 
characteristics with outcomes. Overall results of the study suggested 
that drugs are a moderately effective treatment for hyperactivity; 
however, a significant number of variables were identified in the 
analysis which require further research, including age of children
and IQ. Among appendixes are references to and analysis of review 
articles, a coding sheet for efficacy of drug treatments for 
hyperactivity, coding conventions for efficacy of drug treatments for 
hyperactivity, and procedures for contacting authors for additional 
information. (Author/SW)

1. INTRODUCTION

Science is built up with facts, 
as a house is with stones; but a 
collection of facts is no more a 
science than a heap of stones is 
a house.— Poincare in L'Hypothese 

Most researchers in the social sciences would agree that their business 
includes the collection of facts. Furthermore, policy makers, administrators, 
and practitioners expect these facts to lead step by step to improved
understanding and to contribute to improvements in practice. Unfortunately,
examples in which facts from many individual research studies have been
logically fitted together, resulting in major changes and improvements in the
social sciences, are difficult to find.

In recent years, the integration of research has received substantial 
attention (Cooper, 1982; Feldman, 1971; Glass, 1976; Jackson, 1980). This 
attention has probably stemmed in part from the frustration of individual 
researchers and funding agencies about the lack of cumulative knowledge 
stemming from individual research studies. More and more people have devoted
their attention to the importance of doing high quality, integrative reviews. 
In 1978, Greg Jackson pointed out that even though the reviewing and 
synthesizing of empirical research on a given topic is a fundamental activity 
in social science research, there is an absence of well-defined methods or
procedures for conducting such reviews. Jackson concluded that these 
circumstances presented a major limitation to the accumulation of knowledge. 
Based on an analysis of a random sample of review articles and correspondence 
with journal editors and officials from government and private organizations 
responsible for reviewing and synthesizing research in the social sciences,
Jackson concluded that there were "important weaknesses in the currently
prevailing methods of integrative reviews" (p. 37). The most frequently
reported problems cited by Jackson included:

(1) the failure of most reviews to consider a complete or representative 
sample of the available evidence; 

(2) the tendency for the conclusions of such reviews to be misleading; 

(3) the failure of reviewers to systematically consider the possible 
relationships between characteristics of review studies and their 
findings; and 

(4) the failure of reviewers to draw inferences for either theory, 
policy, or practice from the results of the studies. 

Feldman (1971) observed that "the half-hearted commitment in reviewing and 
integrating completed research might account in part for the relatively 
unimpressive degree of cumulative knowledge in many fields of the behavioral
sciences" (p. 86).

Emphasizing the importance of integrative research, Glass (1976) pointed
out:

A good review is the intellectual equivalent of original 
research .... we need more scholarly effort concentrated
on the problem of finding the knowledge that lies
untapped in completed research studies .... The best
minds are needed to integrate this staggering number of
individual studies. This endeavor deserves higher
priority now than adding a new experiment or survey to
the pile. (p. 4)

An important area of research in which it has been particularly 
difficult to integrate and draw conclusions from the findings of previous 
research concerns the use of drugs to ameliorate the symptoms of hyperactive 
children. According to Trites (1979), hyperactivity is the most frequent 
reason children are referred to clinics and special school services. During
the last decade hundreds of research studies have investigated various
treatments for ameliorating the symptoms of hyperactivity. Also referred to as
"hyperkinesis," "minimal brain dysfunction," and "attention deficit disorder,"
hyperactivity has been called "one of the major childhood disorders of our 
time" (Ross & Ross, 1976). 

In general, a child is considered to be hyperactive if he or she 
consistently exhibits an excessively high level of activity in situations 
where it is clearly inappropriate and is unable to inhibit his or her 
activity on command. Hyperactivity is often characterized by other 
psychological, learning, and behavioral problems, such as impulsivity, low 
self-esteem, poor academic performance, aggression, and distractibility.
Although there are no firm statistical data, Grinspoon and Singer (1973)
estimate that 4 to 10% of U.S. elementary school children are hyperactive and
point out that educators claim the incidence to be as high as 15 to 20% (see
also Bussey, 1967; Miller, Palkes, & Steward, 1973; Sprague, 1979; Stewart, 
1975). A fairly conservative estimate of 5% would suggest that 1.5 to 2 
million elementary-aged children in the U.S. are hyperactive. Clearly, 
hyperactivity is a problem of significant proportions. 

Drugs are the most frequently used treatment for hyperactivity. In spite 
of hundreds of completed research studies using drugs, there is little 
agreement about their effectiveness, whether different types of children
respond differently, or how much the symptoms of hyperactivity can be reduced.
Discrepant results in the research literature regarding the treatment of
hyperactivity point up the need for a methodologically sound review aimed at
identifying the reasons for these discrepancies and providing more definitive 
information about the effectiveness of using drugs for the treatment of 
hyperactivity.

Objectives

The major aims of the project described in this report were to: 

1. Determine if drugs can be used effectively with hyperactive children
to decrease activity level, aggressiveness, and impulsivity; and 
improve cognitive performance, attention, academic achievement, and 
behavior. 

2. Determine what child and intervention characteristics (e.g., age of 
child, nature of intervention, involvement of family) covary with 
and/or influence intervention effectiveness. 

3. Prioritize and focus future research efforts by identifying those 
research questions which need further investigation and replication 
as opposed to those questions which have been sufficiently 
investigated, documented, and replicated. 

Significance 

The problems associated with hyperactivity are pervasive. As Barkley 
(1979) noted: 

Hyperactive children are often described as inattentive,
overactive, and impulsive (Safer & Allen, 1976) . . . .
Many demonstrate problems in noncompliance to adult 
commands (Barkley & Cunningham, 1979a; Campbell, 1973; 
Campbell, 1975) as well as aggressiveness towards 
others. Academic underachievement (Cantwell & 
Satterfield, 1978) and problems in classroom conduct are 
also evidenced .... As they develop into later 
childhood and adolescence, school failure, poor peer 
relationships, trouble with the law, and secondary 
reactive emotional problems are likely to occur (Ross & 
Ross, 1976). Although their social conduct problems may 
lessen as they enter young adulthood (Weiss, Hechtman, 
Perlman, Hopkins, & Wener, 1979), many are still more 
restless and inattentive than their normal peers and may 
develop problems with alcohol abuse (Blouin, Bornstein, 
& Trites, 1978). Hence, the disorder of hyperactivity 
is a lifelong difficulty associated with chronic 
academic problems and poor social relations. (p. 412)

Hundreds of research studies have investigated the effectiveness of
treating hyperactivity with drugs, but the results are disturbingly discrepant
and reviewers cannot agree. For example, in commenting on the effectiveness of
drug treatments, Wender (1971) stated "minimal brain dysfunction is probably
the single most common disorder seen by child psychiatrists ... the correct
treatment is often dramatically effective and is always cheap and readily
accessible" (p. 1). In contrast, Adelman (1977) concluded that "the widespread
use of such drugs for treatment of [hyperactivity] is premature . . . and
perhaps quite dangerous" (p. 401), and Eisenberg and Conners (1971) concluded
that "the continuing use of drugs despite the frequently negative outcomes of
controlled studies indicates that behavioral disorders in children are placebo
responsive" (p. 3^*2).

Additional examples of the contradictions in the research literature 
concerning the effectiveness of using drugs for the treatment of hyperactivity 
could easily be cited, but would serve little purpose. The current state of
confusion was summarized well by Freeman (1976) who stated:

There is only one phrase for the state-of-the-art and 
practice in the field of minimal brain dysfunction, 
hyperactivity, and learning disability in children: a 
mess. There is no more polite term which would be 
realistic. The area is characterized by rarely 
challenged myths, ill-defined boundaries, and a 
strangely seductive attractiveness. These categories 
and their management, because of massive support from 
frustrated parents, professionals, government, and the 
drug and remedial education industries, constitute an 
epidemic of alarming proportions. 

Kinsbourne and Swanson (1979) noted that "so much is known about
hyperactivity that the information has become confusing. Before more work is
done, some simplifying generalizations are needed" (p. 1). One explanation for
the contradictory conclusions regarding the efficacy of using drugs for the
treatment of hyperactivity is that previous reviewers have failed to conduct the
type of integrative review which could advance knowledge in the field. Given the
large numbers of children affected by hyperactivity, the millions of dollars 
spent yearly on the treatment of hyperactivity, and the contradictions in 
previous research studies and reviews, a high quality review and summarization 
of existing research is urgently needed. As will be documented in the 
following section, existing reviews suffer from major methodological weaknesses 
that may account in large part for their contradictory findings. The methods 
used in the project described herein avoid most of these previous problems and, 
in so doing, yield information which is more credible and comprehensive.

2. REVIEW OF PREVIOUS EFFORTS TO INTEGRATE THE RESEARCH

Jackson (1980) suggested that the quality of a review, and hence the 
confidence one should place in the conclusions of the review, can be judged by
examining how well the review meets criteria in six areas.

1. Selecting a Topic -- Was the topic appropriately defined and delimited?

2. Review of Previous Work -- Were previous efforts to review similar
bodies of literature cited and critiqued so that: (a) it is clear how
the present work will differ from or extend previous work; (b) an
appropriate point of departure for the present work can be determined;
and (c) the present work will avoid the mistakes of past reviews?

3. Selecting Studies to be Reviewed -- Were the criteria for selecting
studies to be reviewed clearly explicated? Was a representative or
comprehensive sample of previous research on that topic reviewed, so
that results of the review are generalizable to the population of
research studies?

4. Data Collection -- Were data collected for each study (so far as
possible) on common dependent variables (study outcomes) and
independent variables (study or subject characteristics such as age of
students, type of intervention, methodological quality)? Were data
collection procedures specifically described and defended on rational
and empirical grounds?

5. Data Analysis -- Was the relationship between dependent and independent
variables examined in both univariate and multivariate dimensions?
Were appropriate analysis techniques utilized?

6. Interpretation and Reporting -- Were results reported in such a way that
the reader can tell exactly what procedures and operational
definitions were used? Are conclusions sufficiently supported by the
data? Could the investigation be replicated based on the information
reported?

These criteria can be used as a yardstick in judging the quality of previous 
reviews on the effectiveness of various treatments for hyperactivity.

Besides the hundreds and hundreds of primary research studies on 
hyperactivity, dozens of reviews have also been completed. A computer-assisted 
literature search of Psychological Abstracts, Dissertation Abstracts
International, CEC Abstracts, Social Science Search, Index Medicus, and
Education Resources Information Center, followed by a hand search and
information derived from bibliographies of already obtained articles,
identified 61 articles which have reviewed the efficacy of various treatments
for hyperactivity. To qualify as a "review," the article had to meet at least 
one of the following criteria: 

1. "Review" was used in the title; or 

2. At least 35 primary research studies were considered to examine the 
effectiveness of a particular treatment for hyperactivity; or 

3. At least 10 primary research articles were considered to examine the 
effectiveness of a particular treatment for hyperactivity, and the 
main purpose of the article was not to report on primary research
conducted by the authors. 

The review articles identified and some additional descriptive information on 
each review are listed in Appendix 1. 
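
As a minimal sketch, the three qualifying criteria above can be expressed as a single predicate. The `Article` record and its field names are hypothetical, introduced only for illustration; matching "review" as a case-insensitive substring of the title is also an assumption, since the report does not say how titles were checked.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical record for one candidate article (not from the report)."""
    title: str
    n_primary_studies: int       # primary studies considered for a treatment
    reports_own_research: bool   # True if the main purpose is the authors' own primary research


def qualifies_as_review(article: Article) -> bool:
    """Return True if the article meets at least one of the three criteria."""
    # Criterion 1: "Review" was used in the title (assumed substring match).
    if "review" in article.title.lower():
        return True
    # Criterion 2: at least 35 primary research studies were considered.
    if article.n_primary_studies >= 35:
        return True
    # Criterion 3: at least 10 primary studies, and the main purpose was
    # not to report primary research conducted by the authors.
    return article.n_primary_studies >= 10 and not article.reports_own_research


candidates = [
    Article("A Review of Stimulant Treatment", 8, False),
    Article("Drug Effects in Hyperactive Boys", 12, True),
    Article("Treating Hyperactivity: What Is Known", 14, False),
]
accepted = [a.title for a in candidates if qualifies_as_review(a)]
```

Because the criteria are joined by "or," an article with few cited studies still qualifies when its title names it as a review, which is why the first hypothetical candidate is accepted and the second is not.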

Procedures for Examining the Quality of Previous Reviews 

Each of the 61 articles was coded as to how well it met criteria in each 
of the six areas previously described. Questions included on this coding sheet
are listed below with additional explanation as necessary. 

1. Did the review article explicitly and specifically state and delimit
the topics to be included? Any statement of the reviewer which
delimited or defined the types of articles to be included in the
review was counted as meeting this criterion. For example, this item
would have been coded "yes" if the author said "this review will
consider all articles which have examined the effectiveness of
behavioral interventions on hyperactivity which meet minimum standards
of methodological quality."

2. Did the reviewer cite articles that are described as previous reviews
on the same topic(s) or on similar topics? This question was coded
"yes" only if the reviewer described such articles in the text and
referred to them as previous reviews. It was not coded "yes" if
reviews were listed in the references but were not referred to in the
text as reviews.

3. Did the reviewer critique previous reviews?

4. Did the reviewer state how the present review would differ from or extend
previous reviews?

5. Did the reviewer state how studies to be included in the current review
were located? To be answered "yes," the reviewer needed to explain
the procedures in enough detail so that someone else could replicate
the review using the same or nearly the same studies. For example, it
was not sufficient to say that an ERIC search was done. The item
would be coded "yes" if the authors said an ERIC search was done and
stated the descriptors used in conducting the ERIC search, the years
which were covered, and whether additional techniques were used to
identify articles.

6. What is the actual number of experimental studies from which results
were used to address the questions posed? To be counted, an article
had to be cited in the text as supporting or refuting a particular
point of view about the effectiveness of some intervention for
hyperactivity. Articles which referred only to methodological issues
or cited different instrumentation which previous research has used
were not counted in this total unless they were also cited pertaining
to the effectiveness of a particular treatment.

7. What is the total number of references cited in the bibliography?

8. Did the reviewer describe with at least 250 words the major
methodological difficulties or shortcomings of the primary or
integrative research on the given topic(s)?

9. Did the reviewer suggest desirable foci or methods for future primary
or integrative research on the topics?

10. What methods were used to consider findings of individual studies?
This question asked about how outcomes from individual studies were
used to draw conclusions about the effectiveness of a particular
treatment for hyperactivity.

Each study considered was categorized in one of six areas: (a)
Effect Size -- some type of standardized metric that was comparable
across all studies, e.g., $(\bar{X}_E - \bar{X}_C) / SD_C$, (b) statistical
significance -- favoring, nonstatistical significance, or against, (c)
single subject designs which were visually analyzed, (d) differences --
each individual study was reported as having found differences or no
differences, but no reference was made to whether it was a statistically
significant difference or not, (e) differences (groups) -- a group
of articles were cited as having found differences but did not make
reference to whether these differences were statistically significant
and did not consider the studies individually, and (f) percent
improved -- percent of subjects showing improvement following the
treatment. If all studies in a particular review used the percent
improved method, it would have been counted as Effect Size since it
was a standard metric for all studies. However, this was never done.

In other words, a reviewer might cite 20 studies. For 10 of these
studies, the percent of subjects improved in each study might have
been reported; 5 of the studies might have been reported as having
found statistically significant differences; 3 of the studies were
reported to have found differences; and 2 of the studies were reported
to have found no differences.

11. How were findings or results of the reviewed studies summarized to
draw general conclusions? This question addressed how, after 
collecting data about the effectiveness for a particular treatment 
from the individual studies, the reviewer summarized the results to 
draw conclusions. Each study was coded in one of three categories: 
(a) general direction of findings— no explicit summarization technique 
was used, but the author did draw conclusions about what the studies 
seemed to be showing, (b) Percentage of studies finding "X"--in this 
case, the author considered all studies in the review and said that 
such and such a percentage favored this treatment and such and such a 
percentage favored that treatment, (c) Effect Size for each study- 
after quantitatively summarizing the results of each study on a common 
metric, the author used this common metric to summarize what could be 
concluded from the research. 

12. How was covariance of outcomes with subject or study characteristics
analyzed? Each article was coded in one of four ways: (a) data
based multivariate -- the reviewer empirically considered the covariance
of subject characteristics with study outcomes for most of the
studies, and simultaneous covariance with more than one subject or
study characteristic was considered, (b) data based univariate -- same
as (a) except that the author only considered one study or subject
characteristic at a time, (c) logically considered for major subset --
coded if the author logically presented information for a substantial
subset of the data but did not do it in a systematic, data based
fashion -- for example, if out of 30 articles considered in the review,
the author pointed out that in 8 of the articles, younger children
appeared to do better on treatment X than older children but made no
effort to report the age of children in the other 22 articles, it
would have been coded "c", (d) not considered for major subset -- coded
if none of the other three were applicable -- for example, an author may
have pointed out that one research article had found that drug therapy
worked better for hyperactive children who had organic brain damage.
Unless some effort was made to either logically or empirically consider
the influence of organic brain damage on outcomes of treatment for
other studies, it would have been coded in this category.

13. What best describes the sample of articles considered in the review to
draw conclusions about the topic? This question was coded in one of
four categories: (a) reasonable approximation of all research -- by
considering the year in which the review was completed, an estimation
could be made of whether the sample considered in the review was a
reasonable approximation of all research on the topic. If the
research considered in the review was not a comprehensive sample, it
would have been considered in category (b) (representative) if the
author gave explicit criteria or procedures that were followed to
assure some degree of representativeness, (c) convenience sample --
unless the author provided explanation that would have placed the
article in one of the other three categories or considered a large
enough sample of articles that convinced us that it was a reasonable
approximation of all research, it was coded as a "convenience" sample,
(d) purposive/exemplary research -- if the authors stated that they
intentionally limited their study to only those articles that met
predetermined standards for methodological quality, it was coded in
this category.

14. Did the reviewer draw conclusions based on the results of the reviewed
studies about theory, policy, or practice?
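
The effect size metric named under question 10 can be illustrated with a short sketch. It assumes the metric is the difference between the experimental and control group means divided by the control group standard deviation, as in Glass's work; the function name and the scores are hypothetical and are not taken from any reviewed study.

```python
from statistics import mean, stdev


def glass_effect_size(experimental, control):
    """Standardized mean difference: (mean_E - mean_C) / SD_C.

    Uses the sample standard deviation of the control group as the
    denominator (an assumption about the intended metric).
    """
    return (mean(experimental) - mean(control)) / stdev(control)


# Hypothetical outcome scores for one study (higher = better attention).
treated = [12, 14, 16, 18, 20]
untreated = [10, 12, 14, 16, 18]

effect = glass_effect_size(treated, untreated)
print(f"effect size = {effect:.2f}")  # roughly 0.63 standard deviations
```

Because the result is expressed in control group standard deviation units, outcomes measured on different instruments can be placed on one scale, which is what separates category (a) from the other five coding categories.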

Quality of Previous Reviews

The results of the coding for the 61 review articles are shown in Table 1 for
each of the questions coded. As can be seen from the data presented in
Table 1, the 61 review articles do not meet most of the criteria suggested for
high quality integrative research. More than half of the review articles
delimit or specifically state the topics to be included. However, even though
almost half cite previous reviews, critiques of previous reviews and
explanations of how the present review will differ from or extend previous
reviews are seldom provided. No one explained how studies to be included in the
review were located. In spite of the fact that literally hundreds and hundreds
of studies have experimentally examined the effectiveness of various
intervention techniques for hyperactivity, the median number of studies
included in reviews was only 12.5. Extensive discussion of topics which should
be central to any high quality review (e.g., major methodological difficulties
or shortcomings of previous research, or suggested procedures for future
primary or integrative research on the topics being reviewed) was
infrequent. Analyzing the results of individual studies, summarizing results
across studies, and considering how outcomes covaried with subject or study
characteristics were seldom done appropriately.
Almost all of the reviews considered a convenience sample of articles. In
spite of these serious shortcomings, reviewers did not hesitate to draw
conclusions about theory, policy, and practice based on their review.
Unfortunately, the basis for such conclusions and the procedures used to reach
those conclusions make it impossible for the reader to have confidence in
the credibility of the conclusions.
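
A data based univariate analysis of the kind the reviews rarely attempted might look like the following sketch, which correlates one study characteristic with study outcomes. The coded values are invented for illustration (they are not results from this project), and Pearson's r is only one of several statistics that could be used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical coded studies: (mean age of children in years, effect size).
coded_studies = [(6.0, 0.9), (7.5, 0.7), (9.0, 0.6), (10.5, 0.4), (12.0, 0.3)]
ages = [age for age, _ in coded_studies]
effects = [es for _, es in coded_studies]

r = pearson_r(ages, effects)
print(f"correlation of age with effect size: r = {r:.2f}")
```

A multivariate version would relate outcomes to several characteristics at once (for example, by regression), which corresponds to category (a) of the covariance question on the coding sheet.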

Table 1

Results of How Well Reviews of Hyperactivity Meet Criteria for High Quality Research

1. Did review article explicitly and specifically state and delimit
   the topic(s) to be included?
      YES: 52%     NO: 48%

2. Did reviewer cite article(s) that are described as previous reviews
   on the same topic(s) or on similar topics?
      YES: 45%     NO: 55%
      OTHER: Mean = 2.75; Median = .5; Range = 0 to 33

3. Did reviewer critique previous reviews?
      YES: 0%      YES Briefly: 3.3%     NO: 96.7%

4. Did reviewer state how this review would differ from or extend
   previous reviews?
      YES: 6.7%    NO: 93.3%

5. Did reviewer state how studies to be included were located?
      YES: 0%      NO: 100%

6. What is the actual number of experimental studies cited in the
   review as support for a particular contention?
      Mean = 24.5; SD = 37.0; Range = 0 to 184; Median = 12.5

7. What is the total number of references cited in the bibliography?
      Mean = 70.3; SD = 109.9; Range = 1 to 803; Median = 45.5

8. Considering the experimental studies cited in reviews, what methods
   were used to present results or findings of individual studies?
      Effect Size:                 1%
      Statistical Significance:    11.3%
      Single Subject:              4.8%
      Differences (each study):    42.6%
      Differences (groups):        40.0%
      Percent Improved:            1.3%

9. How were findings or results of the reviewed studies summarized
   to draw general conclusions?
      General Direction of Findings:        91%
      Percentage of Studies Finding "X":    7%
      Effect Size for Each Study:           2%

10. How was covariance of outcomes with subject or study characteristics
    analyzed?
      Data Based Multivariate:                    0%
      Data Based Univariate:                      0%
      Logically Considered for Major Subset:      5.0%
      Not Considered for Major Subset:            95.0%

11. What best describes the sample of articles considered in the review
    to draw conclusions about the topic?
      Reasonable Approx. of All Research:    5.0%
      Representative:                        0%
      Convenience:                           93.3%
      Purposive/Exemplary:                   1.7%

12. Did reviewer draw conclusions, based on the results of the reviewed
    studies, about theory, policy, or practice?
      YES: 30%     YES Briefly: 70%     NO: (no value shown)

Note: All percentages indicate the percentage of the 61 reviews except where noted.

Kerlinger (1977) has stated "the basic purpose of scientific research . . .
is to understand and explain theory. Science then really has no other purpose
than . . . understanding and explanation" (p. 5). Kerlinger's statement,
similar to common and acceptable definitions of science, represents the basic
motivation and rationale for conducting educational research. Apparently,
those who support and conduct educational research believe that the results of
such research will improve our understanding and explanation of the educational
process and thereby lead to the development, implementation, dissemination, and
adoption of practices which will improve the quality of our educational system.

During the last 20 years, we have witnessed an explosion in the amount of 
educational research being conducted. Hundreds of thousands of research
studies are completed every year. Journals are flooded with articles reporting
research. Archive systems such as ERIC and Dissertation Abstracts 
International have made the results of hundreds of thousands of unpublished and 
fugitive manuscripts more readily available, and computer-assisted 
bibliographic searches have made comprehensive and complicated searches for 
existing literature much more feasible. Still, an all too familiar criticism 
of educational research is that, in most cases, it has failed to have 
significant impact on improving the quality of educational practices (Clifford, 
1973; Kerlinger, 1977; Shaver, 1979; Strike, 1979). 

The analysis of reviews on hyperactivity suggests that one major
explanation for this lack of cumulative knowledge is the methods that have
typically been used in attempting to integrate completed research. Although
reviewers frequently criticize the methodological quality of primary research
studies and call for more studies with improved methodology to be conducted,
this analysis suggests that the same criticism can be made of reviewers. If
reviewers would attempt to meet criteria in the same basic areas which they
recommend for primary research, then more progress would be made in addressing 
the problems of educational research and drawing conclusions which can be 
defended and can lead to improvements in practice and policy.

3. PROCEDURES

As demonstrated in the previous section, many of the reviews which have 
examined the effectiveness of various treatments for hyperactivity suffer from 
serious deficiencies. Although a number of approaches have been suggested for 
reviewing literature, most of them also fail to meet the criteria for high 
quality research. A brief summary of some of the more frequently used 
techniques for reviewing research is presented below. Based on this summary, 
it is suggested that the meta-analysis approach recommended by Glass and his 
colleagues (Glass, 1978; Glass, 1980; Glass, McGaw, & Smith, 1981; Glass & 
Smith, 1978, 1979; Smith & Glass, 1980) provides the best approach for 
reviewing and drawing conclusions about previously completed research in areas 
such as the treatment of hyperactivity. 

Alternative Approaches for Integrating Completed Research 

The most commonly used technique for reviewing research is a narrative 
approach based on a group of easily accessible articles from fairly prominent 
journals or other publications. Using 20 to 40 research articles, the reviewer 
offers a verbal synopsis of each article, sometimes critiquing the methodology 
and credibility of the conclusions, and often concluding that the existing 
research is inconclusive—sometimes researchers reach one conclusion, sometimes 
another. A call is then made for additional research using better techniques 
and more precise methodology so that the truth of the matter can be
discovered. 

In a slight variation of the narrative review approach, the reviewer 
begins with a similar group of articles but eliminates all but a small number
because of supposed design or analysis flaws. The findings of the remaining 
"acceptable" studies are presented as the truth of the matter. Unfortunately, 
a judgment as to what constitutes a good article frequently differs from
reviewer to reviewer, and criteria for selecting "methodologically superior"
articles are often overly restrictive and result in very small, and frequently 
nonrepresentative, samples of articles being considered. Moreover, as Smith 
and Glass (1980) have pointed out, "methodologically good" studies often report 
contradictory findings which can create considerable difficulty in evaluating 
what conclusions should be reached. 

A more systematic approach to integrating the outcomes of primary research 
is what Light and Smith (1971) refer to as "the voting method". In the voting 
method, a relationship between a dependent and an independent variable is 
tallied as positively statistically significant, negatively statistically 
significant, or not statistically significant. Studies are not usually 
weighted according to the size of the sample utilized in conducting the 
research. Since larger sample sizes lead to a greater probability of 
concluding that results are statistically significant, the voting method 
systematically discriminates against studies with small samples. Consequently, 
the true relationship may never be detected, and/or misleading conclusions may 
be drawn. Additionally, the voting method incorrectly implies that inferential 
statistics reveal the degree or importance of relationship, and that artifacts 
of measurement, bias, and the issues of experimental validity are controlled 
for adequately in all studies. As Glass (1977) pointed out, nine small sample 
studies may yield not-quite-significant results in one direction while a tenth 
large sample study yields statistically significant results in the opposite 
direction. The vote in this case is one for and nine against, a conclusion 
quite at odds with one's best instincts. 
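
The sample-size problem described above can be illustrated with a minimal sketch. It assumes two groups of equal size and a large-sample normal approximation to the test statistic; the function name `vote` and the sample sizes are hypothetical and chosen only for illustration. Two studies that found exactly the same standardized mean difference are tallied differently solely because of their sample sizes.

```python
from math import sqrt
from statistics import NormalDist


def vote(effect_size, n_per_group, alpha=0.05):
    """Tally one study as '+', '-', or '0' (not statistically significant).

    Assumption: two groups of equal size and a normal approximation,
    so the test statistic is z = ES / sqrt(2 / n).
    """
    z = effect_size / sqrt(2.0 / n_per_group)
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = .05
    if z > critical:
        return "+"
    if z < -critical:
        return "-"
    return "0"


# Hypothetical studies that found the same effect size of 0.40
small_study = vote(0.40, n_per_group=10)   # z is about 0.89
large_study = vote(0.40, n_per_group=100)  # z is about 2.83
print(small_study, large_study)
```

Under these assumptions the small study is tallied as not significant and the large study as positive, although the size of the treatment effect is identical in both.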

In an effort to improve on the voting method, Light and Smith (1971) 
concluded that "... progress will only come when we are able to pool, in a 
systematic manner, the original data from the studies" (p. 243). 
Unfortunately, original data from studies are frequently extremely difficult to 
obtain, and this procedure must disregard any researcher's data which is not 
obtainable. Glass (1977) reported that Wolins (1962) wrote to 37 authors 
asking for their data from studies published in the preceding two years; five 
did not reply, 21 reported that their data were irretrievable, two refused to 
share the results, and four sent their data too late to be useful. 

The approach used in this project is referred to as meta-analysis and was 
first proposed by Glass (1976). Properly implemented, the "meta-analysis" 
approach meets all of the criteria for high quality integrative reviews 
proposed by Jackson (1978). Conducting a meta-analysis requires the location 
of either all studies or a representative sample of all studies on a given 
topic, converting the results of each study to a common metric, coding the 
various characteristics of studies that might have affected the results, and 
using relational and descriptive statistical techniques to summarize study 
outcomes and examine the covariation of study characteristics with outcomes. 
In his critique of previous efforts to integrate the findings of social science 
research, Jackson (1978) concluded that the "meta-analysis approach is a very 
important contribution to the social science methodology . . . . it will often 
prove to be quite valuable when applied and interpreted with care" (p. 47). 

Since its introduction, the meta-analysis approach has been used to 
integrate research findings on a wide variety of topics including the relation 
of class size to achievement (Glass & Smith, 1979; Smith & Glass, 1980); the 
relation of socioeconomic status and academic achievement (White, 1982); the 
effectiveness of training and reinforcement on standardized test results 
(Taylor & White, 1982); and neuropsychological assessment for brain-damaged 
children (Davidson, 1978). More than 100 completed meta-analysis studies 
suggest that meta-analysis techniques are accepted as a useful methodology by 
substantial numbers of professionals. 

It should be noted that some educational researchers have raised questions 
about the usefulness of meta-analysis (Educational Research Service, 1980; 
Eysenck, 1978; Gallo, 1978; Mansfield & Bussey, 1977; Shaver, 1979; Simpson, 
1980). Some of these have questioned the results of a specific meta-analysis, 
while others have raised cautions or concerns about the meta-analysis approach 
per se. Most of these criticisms and cautions have been responded to in the 
literature (Glass, 1978, 1980; Glass et al., 1981; Glass & Smith, 1978). The 
most important point the concerns and questions have demonstrated is that 
meta-analysis, like all other research procedures, is not a fail-safe approach. 
If applied carelessly, many problems will occur. However, the meta-analysis 
approach, if properly implemented, has excellent potential as a tool for 
integrating research about the effectiveness of various treatments for 
hyperactivity. 

Procedures for the Hyperactivity Meta-Analysis 

The specific activities and procedures used in conducting the 
meta-analysis of the research on the treatment of hyperactivity are described 
below for each of the six areas suggested by Jackson (1980) for determining 
whether an integrative review is of high quality. Examples from previous 
meta-analyses are used to provide supporting evidence for the advantages of the 
meta-analysis approach and additional detail on the procedures to be used. 

1. Selecting and delimiting the topic. The way in which the 
investigation of any research topic is defined determines in large part the 
questions which will be answered. A topic which is too narrowly defined may 
only be able to answer trivial questions or may overlook important conclusions 
revealed by previous research. A topic which is too broadly defined may lead 
to the consideration of studies which are so divergent as to be uninteresting. 
Included in this integrative review were all those studies that have 
empirically investigated the efficacy of drug treatments for hyperactivity. 

Key terms in the preceding statement are defined below: 

Drug Treatment - Any treatment which attempts to ameliorate the symptoms 
of hyperactivity by administering a drug or chemical substance to the 
subject. 

Hyperactivity - Any pattern of behavior or activity level demonstrated or 
considered to be excessive. This definition is necessarily broad. 
Researchers employ a wide variety of criteria for defining hyperactivity. 
These range from accepting the opinion of a parent, teacher, or physician 
that a subject is hyperactive, to making systematic observations of 
subjects or using electro-mechanical devices to measure motor activity. A 
system for coding these various methods of defining hyperactivity was used 
as a part of the coding system described below. 

Although some have argued that integrative reviews should only consider 
methodologically superior studies, our experience has been that this practice 
frequently excludes studies which can provide important information. The 
relation between study outcomes and methodological adequacy can be empirically 
assessed as a part of the meta-analysis. Then, if it is determined that the 
methodological adequacy of studies is confounding the results, appropriate 
adjustments can be made. 

It should be noted that decisions concerning what to do about 
methodological inadequacies are different for a person conducting a primary 
research study than for a person integrating the results of previous studies. 
As Glass (1977) has noted, a researcher does not set out to perform a study 
deficient in some aspect of measurement or analysis, but it hardly follows that 
after a less than perfect study has been done, its findings should not be 
considered. 

Many weak studies can add up to a strong conclusion. 
Suppose that in a group of one hundred studies, studies 1 to 
10 are weak in representative sampling but strong in other 
respects, studies 11 to 20 are weak in measurement but 
otherwise strong, studies 21 to 30 are weak in internal 
validity only, studies 31 to 40 are weak only in data 
analysis, etc. But imagine also that all 100 studies are 
somewhat similar in that they show a superiority of the 
experimental over the control group. The critic who 
maintains that the total collection of studies does not 
support strongly that conclusion of treatment efficacy is 
forced to invoke an explanation of multiple causality (i.e., 
the observed difference can be caused either by this 
particular measurement flaw or that particular design flaw or 
this particular analysis flaw or . . .). The number of 
multiple causes which must be invoked to counter the 
explanation of treatment efficacy can be embarrassingly large 
for even a few dozen studies. Indeed, the multiple defects 
explanation will soon grow into a conspiracy theory or else 
collapse under its own weight. Respect for parsimony and 
good sense demands an acceptance of a notion that imperfect 
studies can converge on a true conclusion, (p. 356) 

Of course, it is also possible that methodologically weak studies will 
yield biased or misleading results. For example, as explained in the results 
section, from the hyperactivity data considered in this project, drugs appeared 
to be substantially more effective in reducing the symptoms of hyperactivity 
when all studies were considered than when the analysis was limited to those 
studies which used control groups, met minimum standards of internal validity, 
and used objective measures to select hyperactive children for the study and to 
measure outcomes. 

As these results demonstrated, the best approach for determining whether 
"weak" studies yield biased results is empirical. Each of the studies included 
in this meta-analysis was classified according to well defined criteria which 
are thought to impact on methodological quality (e.g., type of control group, 
reliability or fakability of outcome measures, "blinding" of judges, duration 
of intervention). Because "weaker" studies yielded different outcomes than 
"stronger" studies, more credence was placed in the results of the "stronger" 
studies. However, if the results had been similar, the inclusion of additional 
studies would have allowed other important questions (e.g., the influence of 
age of child or duration of treatment) to be examined more completely. 

In summary, any study which investigated the efficacy of drug treatments 
for hyperactivity was considered in the meta-analysis. By considering all of 
these studies, questions of whether methodological adequacy covaries with 
results were examined empirically while at the same time substantially 
expanding the data base so that questions of how other study characteristics 
covary with study outcomes could be considered more adequately. This approach 
does not in any way condone future experiments that have weaknesses of design, 
analysis, or measurement. 

2. Reviewing previous work on the same topic. As has already been noted, 
one part of this project was to examine previous reviews which have attempted 
to integrate the research literature on hyperactivity. In addition to 
demonstrating the need for a project such as this, the analysis of previous 
reviews often provides important information which can be the key to making 
sense out of the research literature. For example, in his meta-analysis of the 
research literature which investigated the relationship between socioeconomic 
status and academic achievement, White (1982) found that the unit of analysis 
used in computing the correlation between SES and achievement accounted for 
almost 40% of the variance in previously obtained correlation coefficients. As 
shown in Figure 1, those studies which had used individual students as a unit 
of analysis had a median correlation coefficient of .22, while those studies 
which had used group means in computing the correlation coefficients had a 
median correlation of .73. This one factor alone did much to clear up the 
confusion about how strongly SES is related to academic achievement. However, 
the unit of analysis used in computing the correlation coefficient was not an 
obvious factor to consider in conducting an integrative review of the 
SES-achievement correlation. Indeed, the "unit of analysis" variable was 
included based on the suggestion of another reviewer, even though the previous 
reviewer had not presented enough evidence to substantiate the importance of 
the variable. If "unit of analysis" had not been considered, important 
questions regarding the relation between SES and achievement would not have 
been resolved. 



Figure 1. Distribution of obtained correlation coefficients of the 
relationship between socio-economic status and academic 
achievement from 100 students. Panels: (a) using students as the unit of 
analysis; (b) using groups as the unit of analysis. 

The same principle applied to integrating the literature on the 
effectiveness of various treatments for hyperactivity. For example, suppose 
100 research studies are considered, 50 of which implemented an intervention 
for hyperactivity and measured the outcome in a structured setting and 
50 of which implemented the intervention and measured the outcome in an 
unstructured setting. Further suppose that those interventions in structured 
settings were very successful and all of the intervention programs in 
unstructured settings were completely unsuccessful. Finally, suppose that 
degree of structure in the setting where the intervention was implemented was 
not systematically considered in trying to organize and interpret previous 
research results. In this admittedly oversimplified and exaggerated example, 
the reviewer would probably conclude that the research concerning intervention 
for hyperactivity is inconclusive: sometimes the intervention is effective, 
sometimes it is not. 

Such an obviously wrong conclusion would occur because the correct 
concomitant variable was not considered. In spite of how obvious such an 
oversight appears when presented in this manner, this is exactly the type of 
mistake that almost all other reviewers of the hyperactivity literature have 
made. The best way to be sure that critical factors are included for 
consideration in the meta-analysis is to conduct a thorough review of what 
other people have suggested as potentially important factors and then to 
consider each of these factors to the degree possible in all of the primary 
research studies. Those factors which the analysis of previous reviews 
suggested are important for the hyperactivity research literature are included 
on the coding sheet used for the project which is included in Appendix 2. 

Another reason for doing such an extensive analysis of previous reviews as 
the first step in conducting the meta-analysis is that it provided historical 
information (with specific references) about the most important issues that 
should be resolved by the meta-analysis. Conclusions of the meta-analysis 
regarding such issues can be referenced back to these contentions to either 
confirm or reject existing notions or hypotheses. 

3. Selecting studies for inclusion in the review. The studies considered 
in the meta-analysis were identified by doing a computer search of the following 
indexes: Psychological Abstracts, Dissertation Abstracts International, 
CEC Abstracts, Social Science Search, Science Search, Index Medicus, and 
Education Resources Information Center (ERIC). Approximately 300 articles were 
identified as relevant for the meta-analysis. As each article identified 
through the computer search was read and coded for the meta-analysis, the 
bibliography of that article was examined to see if additional articles were 
referenced which would be appropriate for the meta-analysis. These articles 
were then obtained and included in the meta-analysis. Not all of these 
articles could be included in the meta-analysis because of insufficient 
information being reported. 

Included in the articles to be coded for the proposed study were both 
published and unpublished research reports. The importance of considering 
research from a variety of sources is clearly demonstrated by the information 
in Table 2 which was taken from Glass, Smith and Barton (1979). As can be 
seen, in nine different meta-analyses, the results often varied substantially 
from source to source. Note, for example, the results of Smith (1978), who 
considered the presence of sex bias in psychotherapy. Studies which had been 
published in journals showed that women were systematically discriminated 
against during psychotherapy, whereas studies reported in theses showed a bias 
in favor of women during psychotherapy. 

Table 2 

Average Effect Sizes in Results of Studies 
from Different Sources 

                                                          Source of Publication
Investigator           Topic                              Journal   Book   Thesis   Unpublished

Kavale (79)            Psycholinguistic training              .50     --      .30           .37
Hartley (77)           Computer-based instruction             .36     --      .28           .54
                       Tutoring                               .77     --      .40          1.05
Rosenthal (76)         Experimenter bias                     1.02     --      .74            --
Smith (78)             Sex bias in psychotherapy              .22     --     -.24            --
Smith (80)             Effects of aesthetic education        1.08     --      .48           .50
                       on basic skills
Carlberg (79)          Special class vs. regular class       -.09   -.01     -.16          -.14
                       Resource room vs. regular class        .32   -.09       --            --
Miller (79)            Drug therapy of psych disorders        .49    .56       --            --
Hearold (79)           TV and anti-social behavior            .40    .14      .18           .23
Smith, Glass,          Psychotherapy                          .87    .80      .66          1.96
Miller (80)

Note: An Effect Size (ES) is defined as the standardized mean 
difference between two groups. Mathematically, $ES = (\bar{X}_E - \bar{X}_C) / SD_C$. 

Questions of Type I errors, bias, and quality of research reported in 
different sources are too important to be ignored when considering the 
questions of what previous research has really concluded about a specific 
topic. Research that is unpublished or reported in government project reports 
is usually reported regardless of the results, whereas some have suggested that 
research has a better chance of being published in journals or books if the 
results show statistically significant differences or agree with contemporary 
points of view. The primary objective of the meta-analysis of the hyperactivity 
literature was not to resolve questions about publication bias. However, 
the necessity of considering previously completed research from all sources is 
clearly evident if one is to draw valid conclusions about what can be concluded 
concerning the effectiveness of treatments for hyperactivity. For example, as 
reported in the results section, we found that the average effect size for 
studies supported in whole or part by commercial drug companies was .55 (n = 
468), while the average effect size for those studies not supported by 
commercial drug companies was only .18 (n = 118). Clearly, any interpretation 
of the hyperactivity literature must at least consider who sponsored the 
research. 

To assure that as many studies as possible were included in the 
meta-analysis, some articles were obtained from sources other than USU. The library 
system at USU was sufficient for obtaining the majority of the articles 
identified. However, additional efforts were sometimes necessary, including 
utilization of the Interlibrary Loan System; requests to Dissertation Abstracts 
International; letters to authors requesting copies of unpublished reports; and 
requests to government archive systems. Although this was a time consuming and 
consequently an expensive undertaking, it was an important one if meaningful 
conclusions were to be drawn from previous research. 

4. Data collection. The key to a successful meta-analysis is the appropriate 
development of the coding/classification system. The basic concept 
behind the meta-analysis approach is to quantify the outcome of all research 
studies on a metric that can be used for all studies in the sample, and then to 
code the various study characteristics which may covary with outcomes in order 
to determine whether or not studies with certain characteristics consistently 
result in one outcome while another type of study produces another outcome. 
The classification system used to code the studies is the basic data collection 
instrument. This classification system must be comprehensive enough to include 
those factors which are contributing to the variance among different studies, 
but cannot be so complex that coding studies becomes an overly burdensome 
task. 

The development of the coding/classification system is an extremely 
important task in conducting a meta-analysis. Appendix 2 shows an example of 
the coding/classification system used in coding studies on drug treatment of 
hyperactivity. As can be seen, this classification system includes information 
about dozens of factors that other reviewers, researchers, and the project team 
thought might be important in explaining the results of research investigating 
the efficacy of drugs for the treatment of hyperactivity. As can be seen in 
Appendix 2, the coding sheet was divided into eight sections: 

    1. Identifying information on the article being coded, 
    2. Description of the research sample, 
    3. Description of how subjects were classified as hyperactive, 
    4. Description of the type of treatment given to subjects, 
    5. Description of the research design, 
    6. Description of potential threats to study validity, 
    7. Description of the research outcome and conclusions, and 
    8. Description of the specific drug treatment employed. 

Another important step in the meta-analysis is the development of the 
conventions by which decisions are to be made in classifying each of the 
variables in the coding/classification system. In other words, it is not 
enough to say that factors related to the internal validity of a study will be 
coded. One must also specify the basic decision rules which will be used in 
determining, for example, whether selection bias is a threat to the internal 
validity of a study. For some factors, these decision rules are obvious and 
need not be specified in great detail. For others, it is critical that the 
decision rules be explicitly specified so that replication could occur. The 
basic conventions used for this project are included in Appendix 3. 

Another important part of the procedures for coding individual studies 
was the development of examples which clarified the basic conventions. After 
coding was initiated, many situations were encountered which were not covered 
by the basic conventions. As coding proceeded, examples of how specific 
conventions were interpreted were noted in an example notebook. In this way, 
the rationale used for past decisions was documented and served to keep future 
decisions consistent. The following examples of how "instrumentation" threats 
to internal validity were coded will clarify this procedure. 

Code #1 (minor threat): Dependent variable for project consisted of 
    continuous 15-minute segment of observation data for each child 
    gathered only once during 6 weeks of intervention. Threat of 
    sampling error. 

Code #1 (minor threat): Dependent variable was observation data 
    collected at different times by different judges for experimental 
    and control groups. Ratings required moderately high inference 
    judgments with vaguely stated criteria. Raters were "blind". 

Code #2 (moderate threat): Dependent variable consisted of pre/post 
    opinion ratings as to degree of improvement. Staff members knew 
    some children were receiving treatment but did not know which 
    ones. 

Code #2 (moderate threat): Dependent variable was observation data with 
    fairly well specified low inference rating system. However, 
    raters were not blind as to who was receiving treatment and had 
    some cause to be biased. 

Code #3 (major threat): Outcome measured on pre/post design with 
    dependent variable being opinion of staff as to degree of 
    improvement with no criteria. Staff knew subjects were being 
    treated for hyperactivity. 
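
An example notebook of this kind could be kept as a simple list of coded precedents. The sketch below is only an illustration: the class names, the record layout, and the shortened wording of each situation are assumptions, and only the three threat codes shown in the examples above are represented.

```python
from dataclasses import dataclass
from enum import IntEnum


class InstrumentationThreat(IntEnum):
    """Threat levels taken from the examples above (codes 1 to 3 only)."""
    MINOR = 1
    MODERATE = 2
    MAJOR = 3


@dataclass(frozen=True)
class NotebookEntry:
    """One documented coding decision (hypothetical record layout)."""
    code: InstrumentationThreat
    situation: str


# Shortened paraphrases of the five examples listed above
notebook = [
    NotebookEntry(InstrumentationThreat.MINOR,
                  "Single 15-minute observation per child in 6 weeks"),
    NotebookEntry(InstrumentationThreat.MINOR,
                  "Different judges and times per group; raters blind"),
    NotebookEntry(InstrumentationThreat.MODERATE,
                  "Pre/post opinion ratings; staff unsure who was treated"),
    NotebookEntry(InstrumentationThreat.MODERATE,
                  "Low-inference observation system; raters not blind"),
    NotebookEntry(InstrumentationThreat.MAJOR,
                  "Staff opinion with no criteria; staff knew of treatment"),
]


def precedents(entries, code):
    """Return the situations previously assigned a given code."""
    return [e.situation for e in entries if e.code == code]


for situation in precedents(notebook, InstrumentationThreat.MODERATE):
    print(situation)
```

Looking up earlier decisions by code in this way is one means of keeping later coding decisions consistent with past ones.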

The most important piece of information to be coded for each article was 
the outcome of the research. The basic outcome measure for each study 
examined in the meta-analysis was an effect size (ES) defined as 
$ES = (\bar{X}_E - \bar{X}_C) / SD_C$. In other words, the ES or outcome for each study was 
defined as the difference between the means of the treatment (i.e., 
"experimental") and the control subjects on a given dependent variable divided 
by the standard deviation of the control group on that variable. Thus, an ES 
of +1.0 as indicated in Figure 2 would indicate that the average person in the 
treated group is one standard deviation above the mean of the control group on 
that particular measure. This measure of ES avoids many of the problems 
encountered in using statistical significance as a measure of the outcome, 
since it is independent of the size of sample and has similar meaning across 
all studies and dependent variables. Quantifying the results of each 
individual study into an ES which has similar meaning across all studies 
allowed comparisons and cross tabulations with other study characteristics. 
Thus, questions about whether certain types of studies are "more effective" 
could be answered. 

Figure 2. Graphic representation of distribution for experimental and 
control groups with an Effect Size (ES) of +1.0. 
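
The effect size definition given above can be sketched in a few lines. The function name and the example means and standard deviation are hypothetical and are used only to show the arithmetic.

```python
def effect_size(mean_treated, mean_control, sd_control):
    """Standardized mean difference, scaled by the control group's SD."""
    if sd_control <= 0:
        raise ValueError("control-group standard deviation must be positive")
    return (mean_treated - mean_control) / sd_control


# Hypothetical study: treated mean 60, control mean 50, control SD 10
es = effect_size(mean_treated=60.0, mean_control=50.0, sd_control=10.0)
print(es)  # 1.0
```

With these illustrative numbers the treated group's mean lies one control-group standard deviation above the control mean, which corresponds to the ES of +1.0 described for Figure 2.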

An obvious problem in this approach is that many reports of research do not 
provide sufficient information to calculate an ES using the 
$(\bar{X}_E - \bar{X}_C) / SD_C$ definition given above. Where means and 
standard deviations were not reported, 
it was frequently possible to obtain estimates of the ES using information from 
reported statistics (e.g., F ratios, t values, r_xy, etc.). For example, if a 
study failed to report the standard deviation of the control group for a 
particular dependent variable, the square root of the within cell mean square 
(MS) from a one-way analysis of variance could be used as an estimate of the 
standard deviation of the control group for that dependent variable (Glass et 
al., 1981). As another example, suppose a study reported the obtained t value for 
a particular comparison between two groups but did not report means and standard 
deviations for the groups. Assuming that the variances of the two groups 
are equal (a standard assumption of the t test), the equation for the obtained t 
ratio was solved to yield an estimated effect size. 

Other equations for estimating effect sizes (ES) from analysis of variance 
summary tables (either one way or factorial designs), reported F ratios, 
probability levels, analysis of covariance results, matched pairs t tests, and 
other summary statistics were used as outlined by McGaw and White (1981). 
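
Two of the conversions described above can be sketched as follows. The sketch assumes the usual independent-groups t statistic, $t = (\bar{X}_E - \bar{X}_C) / (s_p \sqrt{1/n_E + 1/n_C})$, in which the pooled standard deviation $s_p$ stands in for $SD_C$ under the equal-variance assumption, so that $ES \approx t \sqrt{1/n_E + 1/n_C}$. The symbols $n_E$, $n_C$, and $s_p$, the function names, and the example values are introduced here for illustration and do not come from the report.

```python
from math import sqrt


def es_from_t(t_value, n_treated, n_control):
    """Estimate ES from a reported independent-groups t value.

    Assumption: equal group variances, so the pooled SD approximates
    the control-group SD and ES = t * sqrt(1/n_E + 1/n_C).
    """
    return t_value * sqrt(1.0 / n_treated + 1.0 / n_control)


def sd_from_ms_within(ms_within):
    """Estimate the control-group SD as the square root of the
    within-cell mean square from a one-way analysis of variance."""
    if ms_within < 0:
        raise ValueError("a mean square cannot be negative")
    return sqrt(ms_within)


# Hypothetical reported values
print(round(es_from_t(2.0, n_treated=25, n_control=25), 3))
print(sd_from_ms_within(16.0))
```

The sign of the estimate follows the sign of the reported t value, so the direction of the comparison still has to be checked against the original report.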

Often, too little information was reported, or the information was 
reported in such a way that it was impossible to estimate an 
effect size (e.g., a probability level from a chi-square test, or an F ratio 
with no supporting ANOVA summary table from a repeated measures ANOVA design). 
In these cases, authors were contacted to obtain information about means and 
standard deviations to calculate the effect sizes. The procedures used for 
writing to authors for additional information are shown in Appendix 4. 

5. Data analysis. There is very little which is complex or statistically 
unique about the data analysis of the information produced from the coding of 
studies in a meta-analysis. As Glass et al. (1979) have noted: 

The approach to research integration referred to as 
"meta-analysis" is nothing more than an attitude of data analysis 
applied to quantitative summaries of individual experiments. By 
recording the properties of studies and their findings in 
quantitative terms, the meta-analysis of research invites one who 
would integrate numerous and diverse findings to apply the full 
power of statistical methods to the task. Thus, it is not a 
technique. Rather it is a perspective that uses many techniques 
of measurement and statistical analysis. 

The most useful data-analysis techniques in meta-analysis studies are 
frequently the most simple. After coding all of the study characteristics and 
outcomes of the studies, frequencies and mean effect sizes were computed for 
each variable. Next, cross-tabulations with the effect size were computed for 
each of the relevant study characteristics which have been coded. For example, 
an average effect size of .85 for 100 effect sizes of methylphenidate and an average 
effect size of .25 for 150 effect sizes of dextroamphetamine would indicate 
that using methylphenidate results in approximately six tenths of a standard 
deviation better gain across all dependent variables than dextroamphetamine. 
This finding could be broken down further to see if the advantage of 
methylphenidate is constant for subjects at all age levels, e.g., 4 to 6 years, 
7 to 9 years, and 10 to 12 years. The results could be broken down still 
further in a three-way tabulation to look at the general methodological quality 
of the study as it interacts with these other two variables. In this manner, 
various combinations of the study characteristics were examined to determine 
how outcomes covary with the characteristics. 
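
The cross-tabulations described above amount to averaging effect sizes within groups defined by one or more coded study characteristics. A minimal sketch follows; the records are invented solely to reproduce the illustrative means of .85 and .25 used in the text, and the function name `mean_es_by` is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical coded records: (drug, age group, effect size)
records = [
    ("methylphenidate", "4-6", 0.90),
    ("methylphenidate", "7-9", 0.80),
    ("dextroamphetamine", "4-6", 0.30),
    ("dextroamphetamine", "7-9", 0.20),
]


def mean_es_by(rows, key):
    """Average the effect sizes within each group defined by `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return {group: round(mean(values), 2) for group, values in groups.items()}


# One-way breakdown by drug, then a two-way breakdown by drug and age group
by_drug = mean_es_by(records, key=lambda r: r[0])
by_drug_and_age = mean_es_by(records, key=lambda r: (r[0], r[1]))
print(by_drug)
print(by_drug_and_age)
```

Adding a further coded characteristic to the key, such as a rating of methodological quality, would give the three-way tabulation mentioned above.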

6. Interpreting and reporting the results. Scientists generally have 
given much import to the interpretation and reporting of their research. 
Reports of research are supposed to be thorough enough to allow other people to 
judge the validity of the findings and interpretations, and to replicate the 
research should they so desire. It is generally believed that reports of 
primary research ought to indicate at least the sampling procedures, essential 
design characteristics, the data collection techniques, the methods of 
analysis, and the findings. These same standards ought to be applied to the 
reporting of integrative reviews, but frequently are not. 

The systematic procedures for collecting and analyzing data in the 
meta-analysis allow the results to be reported in enough detail so that others 
can judge the plausibility and validity of the findings. The explicit and 
systematic manner in which the meta-analysis was conducted also helps to ensure 
that interpretations do not overstep the quality of the data which have been 
collected. In many reviews, the degree to which the conclusions are supported 
by the data is difficult to determine since the procedures and techniques used 
to collect, analyze, and interpret the data exist mostly in the mind of the 
reviewer rather than being explicitly stated as procedures. 

4. RESULTS AND DISCUSSION 

Characteristics of Data Set 

Seven hundred and fifteen effect sizes (ES) were obtained from the 
meta-analysis coding. As explained previously, an effect size was defined as 
the mean of the experimental group minus the mean of the control group divided 
by the standard deviation of the control group as shown in Formula 1: 

$$ES = \frac{\bar{X}_E - \bar{X}_C}{SD_C} \qquad (1)$$

This definition of effect size allowed results from one study to be compared 
with results of another study, or results from one outcome measure to be 
compared with results of other outcome measures, without being confused by 
artifacts of statistical significance or scaling. 
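
As a minimal sketch of Formula 1, the function below computes an effect size from raw scores. The scores are hypothetical and invented only for illustration; the use of the sample (n - 1) standard deviation is an assumption, since the report does not state which form was used.

```python
from statistics import mean, stdev


def effect_size(experimental, control):
    """Formula 1: (experimental mean - control mean) / control-group SD.

    Assumption: stdev() is the sample (n - 1) standard deviation.
    """
    return (mean(experimental) - mean(control)) / stdev(control)


# Hypothetical outcome scores, not taken from any study in the report.
treated = [12.0, 15.0, 11.0, 14.0, 13.0]
untreated = [10.0, 13.0, 9.0, 12.0, 11.0]

es = effect_size(treated, untreated)
print(f"ES = {es:.2f}")
```

Because the difference is divided by the control-group standard deviation, the result is expressed in standard deviation units regardless of the scale of the original outcome measure.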

The distribution of the magnitude of these effect sizes is shown in Figure 
3. As can be seen, when all 715 effect sizes are considered without regard to 
other subject or study characteristics, the mean effect size resulting from the 
treatment of hyperactivity with drugs is .44 with a median of .40. In other 
words, children who received drugs for the treatment of hyperactivity are, on 
the average, .4 of a standard deviation better off than children who are not 
treated with drugs. This effect size of .40 indicates that a child who has 
received drugs would score at the 66th percentile of a group of children who 
did not receive drugs. 
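
The percentile statement can be checked with a short sketch. It assumes that scores in the untreated group are approximately normally distributed, which the report does not state explicitly; the function name is illustrative only.

```python
from statistics import NormalDist


def percentile_in_untreated_group(es):
    """Percentile of the untreated distribution reached by the average
    treated child, assuming normally distributed untreated scores."""
    return 100 * NormalDist().cdf(es)


median_es_percentile = percentile_in_untreated_group(0.40)
print(round(median_es_percentile))
```

Under this assumption an effect size of zero corresponds to the 50th percentile, so the .40 figure represents a gain of roughly 16 percentile points.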

Figure 3. Distribution of Effect Sizes for studies which have 
examined the effectiveness of treating hyperactivity with drugs (n = 715). 

Before examining the interactions of various other subject characteristics 
with outcome, some of the characteristics of the data set from which these 
effect sizes were obtained will be described. Overall, the quality of research 
which has examined the effectiveness of treating hyperactivity with drugs is 
better than many have supposed. Of the 715 effect sizes, 567 or 73% came from 
studies with cross-over designs or where subjects were randomly assigned to 
experimental and control groups. Fifty-one percent of the effect sizes were 
obtained from studies which employed a placebo in the control group, and 78% of 
those placebos were judged to be high quality placebos. Most effect sizes 
(approximately 80%) were obtained from studies which took some measures to 
assure that subjects, treatment implementors, and data collectors were blind as 
to which group of children was receiving the treatment. The quality of these 
blinding procedures was often quite good, although improvements would have been 
desirable (e.g., 46% of the effect sizes had good blinding for subjects, 31% 
had good blinding for the treatment implementor, and 31% had good blinding for 
the data gatherer). Forty-two percent of the effect sizes were obtained from 
studies which had excellent or good ratings of methodological quality. 
Thirty-three percent of the effect sizes came from studies with fair ratings of 
methodological quality, and 25% came from studies with either poor or very poor 
ratings of methodological quality. Although these ratings do indicate that 
there is need for further improvement and rigor in the research which examines 
the treatment of hyperactivity, the ratings also indicate that a good many high 
quality studies are available upon which to base conclusions. 

Most of the studies considered in this meta-analysis were conducted in the 
1970s. A few studies occurred as early as 1945, with a substantial increase in 
research activity occurring in the 1970s. The median year from which effect 
sizes in this meta-analysis came was 1974. As shown in Table 3, most studies 
came from "medical" instead of "educational" journals. However, differences in 
average effect sizes between these two categories were trivial. The number of 
subjects included in experimental groups ranged from 2 to 217, with a median 
sample size for experimental groups of 29. In summary, the conclusions 
which follow regarding the treatment of hyperactivity with drugs are based on a 
large number of studies in which literally thousands of children were 
examined in experimental treatments. The studies cut across a broad range of 
years and considered many different outcomes which might be affected by 
treating hyperactivity with drugs. 

Table 3 

Frequency and Average Magnitude of Effect Sizes 
in Different Types of Journals 

                        Mean ES     n     SEM 

Educational journals      .43     (207)   .05 
Medical journals          .45     (508)   .02 

Potentially Confounding Variables 

The overall statement of the effectiveness of using drugs for the 
treatment of hyperactivity (i.e., a median effect size of .40) indicates that 
drugs do have a moderate but positive effect on ameliorating the symptoms of 
hyperactivity. However, the real power in the meta-analysis approach is that 
it allows the examination of various factors which may interact with this 
general statement of effectiveness. Most interesting are those study and 
subject characteristics which covary with effect size. For example, are 
certain types of drugs more effective? Or, do drugs work better with younger 
rather than older children? Or, do drugs have greater impact on certain types 
of dependent measures? However, before examining these questions, it is 
important to consider whether there are variables which may be confounding the 
relationship between study characteristics and effect size and thus misleading 
researchers. 

Support from commercial companies. For approximately 65% of the effect 
sizes obtained, some type of commercial support from drug companies was 
provided to the study. Table 4 shows the average effect size for studies which 
received various levels of financial support from commercial companies. Those 
studies which received complete financial support yielded dramatically higher 
average effect sizes (.79) than those studies receiving no support (.18). 
Table 4 breaks these results down further by quality of research design. As 
can be seen in the panel to the right of that table, the trend for high effect 
sizes being associated with support from commercial drug companies holds true 
for both high quality and low quality research, although the differences are 
more dramatic for research which was of high quality. Data in Table 4 suggest 
that one must be cautious in interpreting the results of research which is 
supported by commercial drug companies. Although certainly not definitive 
evidence, these data do suggest that support for research from commercial 
companies may bias the results of the research. Notice particularly the fact 
that studies which received no commercial support and where the research was of 
high quality had an average effect size very close to zero. 

Some explanation of the data displayed in Table 4 will be 
helpful in interpreting the remaining data displays because all tables have 
been constructed using a similar format. The first value in each cell, the 
mean ES, is the mean effect size for that cell. n indicates the number of 
effect sizes on which that mean is based. SEM is the standard error of the 
mean ES. This was obtained by dividing the standard deviation for the 
distribution of effect sizes in that particular cell by the square root of n. 
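
The SEM calculation described above can be sketched as follows. The effect sizes in the example cell are hypothetical, and the sample (n - 1) standard deviation is assumed because the report does not say which form was used.

```python
from math import sqrt
from statistics import stdev


def standard_error_of_mean(effect_sizes):
    """SEM = standard deviation of the cell's effect sizes / sqrt(n)."""
    return stdev(effect_sizes) / sqrt(len(effect_sizes))


# Hypothetical effect sizes for one table cell (not data from the report).
cell = [0.2, 0.5, 0.8, 0.4, 0.6]
sem = standard_error_of_mean(cell)
print(f"n = {len(cell)}, SEM = {sem:.2f}")
```

Since n appears under a square root, quadrupling the number of effect sizes in a cell only halves the SEM when the spread of the effect sizes stays the same.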

Table 4 

Average Effect Size for Studies Broken Down by 
Amount of Financial Support and Quality 
of Research Design 

Financial      Overall ES         Quality of Research Design: mean ES (n)                     High Quality     Low Quality 
Support from                                                                                  (1,2)            (3,4,5) 
Drug Company   Mean  n      SEM   1           2           3           4           5           Mean  n    SEM   Mean  n    SEM 

Complete       .79   (17)   .16                           .51 (12)                1.47 (5)                     .79   17   .16 
Partial        .54   (451)  .04   .43 (8)     .51 (177)   .58 (152)   .58 (40)    .55 (74)    .51   185  .05   .57   266  .04 
None           .18   (118)  .10   .56 (20)    -.23 (52)   .33 (34)    1.11 (6)    .64 (6)     -.01  72   .08   .47   46   .10 

The standard error of the mean helps one determine if apparent differences 
are real or only the result of sampling fluctuation. For example, in Table 4, 
the average effect size runs from .79 for studies receiving complete support, 
to .54 for studies receiving partial support, to .18 for studies receiving no 
support from commercial companies when all effect sizes are considered. The 
larger the number of effect sizes used in calculating an average effect size, 
the smaller the standard error of the mean will be, all other things being 
equal. A good rule of thumb is that if "confidence intervals" of 2 standard 
errors of the mean around each mean do not overlap each other, then there is a 
good chance that the differences are not due to sampling fluctuation. For 
example, in Table 4, the average effect size for studies receiving partial 
support from commercial companies was .54 with a standard error of the mean 
equaling .04. This indicates that the best estimate of the true mean for those 
studies receiving partial support from drug companies is somewhere between .46 
and .62 (.54 ± .08). The best estimate for the average effect size for those 
studies receiving no commercial support is between -.02 and .38 (.18 ± .20). 
Since these "confidence intervals" do not overlap, one can be reasonably 
confident that the differences in average effect size between those studies 
receiving partial support and those receiving no support are not due to 
sampling fluctuation. 
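
The 2-SEM rule of thumb can be written as a small check. The means and SEMs are the overall values reported in Table 4; the function names are illustrative only, and the overlap test is the informal screening rule described in the text, not a formal significance test.

```python
def interval(mean_es, sem, width=2.0):
    """Return (lower, upper) bounds of mean_es +/- width * sem."""
    return (mean_es - width * sem, mean_es + width * sem)


def intervals_overlap(a, b):
    """True if two (lower, upper) intervals share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Overall mean ES and SEM by level of commercial support (Table 4).
complete = interval(0.79, 0.16)
partial = interval(0.54, 0.04)
none = interval(0.18, 0.10)

print("partial vs. none overlap:", intervals_overlap(partial, none))
print("complete vs. partial overlap:", intervals_overlap(complete, partial))
```

By this rule the partial-support and no-support means are distinguishable, while the complete-support mean, which rests on only 17 effect sizes, has an interval wide enough to overlap the partial-support interval.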

Many of the tables reported in the remainder of this section break down 
overall effect sizes by quality of research design. This has been done because 
quality of research for many variables was found to confound the interpretation 
of overall effect sizes. For each study considered, quality of research design 
was coded from 1 (high quality research) to 5 (low quality research). In 
addition to indicating the average effect size for each rating of research 
designs, the right-hand panel in Table 4 categorizes the studies into those 
studies that received either excellent or good ratings (1 or 2) as opposed to 
those studies which received moderate, poor, or very poor ratings (3, 4, or 5). 
When no other indication is given, numbers in parentheses indicate the number 
of effect sizes upon which an estimate was based. The numbers in bold-faced 
type indicate the average effect size as is done in the middle panel in Table 
4. Finally, any estimates of average effect size which were not based on more 
than five effect sizes have generally been eliminated from these tables. A few 
exceptions have been made where not including an estimate based on a low number 
of effect sizes would have been misleading. 
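
The relationship between the per-rating cells and the collapsed high and low quality columns can be sketched as below. This assumes the collapsed values are n-weighted means of the per-rating cells, which the report does not state but which reproduces the printed figures; the cell values are those for the "None" row of Table 4.

```python
def pooled_mean(cells):
    """Combine (mean_es, n) cells into an n-weighted mean and total n."""
    total_n = sum(n for _, n in cells)
    weighted_sum = sum(mean_es * n for mean_es, n in cells)
    return weighted_sum / total_n, total_n


# "None" row of Table 4: (mean ES, n) for each design-quality rating.
high_quality = pooled_mean([(0.56, 20), (-0.23, 52)])           # ratings 1, 2
low_quality = pooled_mean([(0.33, 34), (1.11, 6), (0.64, 6)])   # ratings 3, 4, 5

print(f"high quality: {high_quality[0]:.2f} (n = {high_quality[1]})")
print(f"low quality:  {low_quality[0]:.2f} (n = {low_quality[1]})")
```

Weighting by n keeps a cell with very few effect sizes, such as the rating-4 cell with n = 6, from dominating the collapsed estimate.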

Procedures for classifying children. As shown in Table 5, a significant 
problem affecting many of the studies was the frequently inadequate procedures 
used in assuring that children selected for hyperactivity research were truly 
hyperactive. Overall, approximately half of the effect sizes were obtained 
from studies where classification procedures were considered poor (e.g., no 
objective measures such as systematic observations or well-defined ratings were 
used to classify the children as hyperactive). The problem with not using 
better procedures to assure that children selected for such research are truly 
hyperactive is emphasized by the fact that the average effect size for those 
children where the procedures for classification were fair to good was only .34 
(n = 321), and the average effect size for those studies where the 
classification procedures were poor was .56 (n = 278). 



Table 5 

Average Effect Size for Classifying Children as 
Hyperactive by the Quality of Classification 

                                   Quality of Classification 
                             Good or Fair               Poor 
                          Mean ES    n    SEM     Mean ES    n    SEM 

General Hyperactivity       .34    (321)  .04       .60    (259)  .04 

Activity Level,           [illegible]               .09    (19)   .16 
Attention, or 
Aggression 

As can be seen in Table 5, the most frequently used basis for classifying 
the children as hyperactive for inclusion in the study was a general measure of 
hyperactivity which was used in 81% of all effect sizes obtained. More 
specific measures such as activity level or attention which are supposedly some 
of the defining characteristics of hyperactivity were used very infrequently. 
The data in Table 5 indicate that some of the apparent improvement in children 
treated with drugs may be due to the fact that many of the children included in 
such research are not really hyperactive in the first place. General 
hyperactivity level was the most frequently used basis for classifying children 
to be included in the study, and a large portion of those effect sizes came 
from classification procedures that were quite poor. More specific detail on 
how these ratings were made is shown on the coding sheet and the conventions in 
Appendices 2 and 3. 

The possibility that many subjects included in the research may not have 
been truly hyperactive is underscored by information obtained from the ratings 
of the severity of hyperactivity. Those subjects who exhibited mild symptoms 
of hyperactivity had an average effect size of .57 (n = 356 effect sizes) 
whereas those subjects who exhibited extreme cases of hyperactivity had an 
average effect size of .35 (n = 59 effect sizes). Again, the much higher 
average effect size for milder cases suggests that even though drugs do have a 
positive impact, the true magnitude of the impact may be overestimated because 
some of the children included in such treatments may not be truly hyperactive. 

Quality of research. Another major area of concern is the quality of the 
research upon which effect sizes are based. Each study considered in 
the meta-analysis was rated on various factors which might have threatened the 
internal validity of the study. These factors generally followed the 
Campbell-Stanley (1966) paradigm for internal and, to some degree, external 
validity of research. As can be seen in Table 6, studies with more serious 
threats generally resulted in higher average effect sizes. In addition, Table 
6 shows that instrumentation and mortality were the most frequently occurring 
threats to the validity of the study. 

Table 6 

Average Effect Size for Studies with Various Threats 
to Internal Validity 

                       No Threat     Minor         Moderate      Serious 
                                     Problems      Problems      Problems 

Maturation             .36 (511)     .62 (136)     .66 (66) 
History                .40 (565)     .53 (76)      .58 (72) 
Testing                .52 (517)     .25 (139)     -.04 (44)     .60 (13) 
Instrumentation        .37 (338)     .60 (206)     .39 (170) 
Regression             .41 (485)     .49 (196)     .58 (34) 
Selection              .44 (546)     .46 (112)     .40 (56) 
Mortality              .38 (443)     .58 (230)     .28 (42) 
Novelty                .44 (597)     .43 (116)     .07 (2) 
Experimenter Effect    .47 (501)     .33 (174)     .51 (39) 

Note: See coding sheet and conventions for more complete explanation of how 
each threat to internal validity was rated. 

The ratings of individual threats to the validity of a study were used in 
determining a general index of validity for the study (procedures for doing 
this are contained in Appendix 3). Table 7 shows the average effect size for 
those studies which had good ratings (1 or 2 on a 5-point scale with 1 being 
high and 5 being low) as opposed to moderate or poor ratings. As can be seen, 
the average effect size for studies with high quality research designs is 
somewhat lower than that for studies with moderate or poor research designs. 
However, as indicated in the schematic at the bottom of that table, the 95% 
confidence intervals (2 standard errors of the mean) for these two estimates 
overlap slightly. Although there is a trend for better research to show 
lower results, one must be cautious about over-interpreting these results. 

Table 7 

Average Effect Size for Studies with Good Research 
Designs Versus Those with Moderate or Poor 
Research Designs 

Quality of Design              Mean ES     n     SEM 

Good (1, 2)                      .36     (298)   .04 
Moderate or Poor (3, 4, 5)       .50     (417)   .04 

[Schematic: 95% confidence intervals for the two means on an effect size 
axis running from .20 to .68] 



Instruments used in collecting outcome data. Also related to the 
quality of the research was the type of instrument used to collect data on the 
outcome measure. Shown in Table 8 are the average effect sizes for those 
studies in which the outcome data were based on someone's opinion as opposed to 
some sort of systematic rating procedure. Studies which obtained data using an 
opinion have substantially higher average effect sizes than studies which 
obtained outcome data via a rating. These data suggest strongly that the 
apparent effectiveness of using drugs for the treatment of hyperactivity may be 
confounded by the rigor with which data are collected regarding the outcome. 

Table 8 

Average Effect Size for Studies Which Used Different 
Instruments to Select Children for Study 

Type of Instrument     Mean ES     n     SEM 

Opinion                  .51     (536)   .03 
Rating                   .18     (101)   .06 

Placebo effect. A related but different concern is shown in Table 9. 
These data show the average effect size for those studies which used a "no 
treatment" control group as opposed to those studies which used a "placebo" 
control group. The difference in average effect sizes between these two 
groups indicates that substantially lower effect sizes are obtained when a 
placebo is used than when a no-treatment control group is used. The 
lower effect sizes obtained with placebos indicate that drug treatment of 
hyperactivity is to some degree placebo-responsive, as has been suggested by 
some previous reviewers. The best estimate of the magnitude of the placebo 
effect is approximately 1/4 standard deviation (the difference between the 
average effect size for no-treatment groups and placebo groups). Although the 
magnitude of this placebo effect is substantial, it is not enough to account 
for the apparent effectiveness of utilizing drugs for the treatment of 
hyperactivity. 

Table 9 

Placebo Effect of Using Drugs for Treatment 
of Hyperactivity 

"Treatment" Used for 
Control Group            Mean ES     n     SEM 

No Treatment               .71     (131)   .06 
Placebo                    .48     (356)   .03 
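
The placebo-effect estimate can be reproduced from the Table 9 values with a short sketch. The standard error of the difference is an added illustration that assumes the two sets of effect sizes are independent; the report itself gives only the difference.

```python
from math import sqrt

# Mean ES and SEM by type of control group (Table 9).
no_treatment_mean, no_treatment_sem = 0.71, 0.06
placebo_mean, placebo_sem = 0.48, 0.03

# Estimated placebo effect: the drop in mean ES when a placebo is used.
placebo_effect = no_treatment_mean - placebo_mean

# Assumption: independent estimates, so the variances of the means add.
se_difference = sqrt(no_treatment_sem**2 + placebo_sem**2)

print(f"placebo effect = {placebo_effect:.2f} SD")
print(f"SE of difference = {se_difference:.3f}")
```

The difference of about .23 is the "approximately 1/4 standard deviation" cited in the text, and the mean effect size of .48 that remains with placebo controls is what supports the statement that the placebo effect does not account for all of the apparent drug effect.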



Degree of treatment implementation. Another methodological 
consideration in interpreting the results from drug research is the degree to 
which the treatment was actually implemented. In some research studies, extra 
precautions are taken to make sure that the subject actually receives the drug 
in the appropriate dosage and at appropriate times. In other studies, no such 
precautions are taken. In those studies considered in the meta-analysis, 
ratings were made of the degree to which one could be confident that treatment 
implementation actually occurred. Table 10 shows the average effect size for 
those studies where there was complete implementation or only minor problems 
with implementation as opposed to those studies where there were major problems 
with implementation. As can be seen, there is a substantial difference in 
average effect size obtained. In what may seem counter-intuitive, given 
the results reported above, the average effect size where there were major 
problems with implementation was substantially higher than where there were 
very few problems with treatment implementation. This may have occurred 
because those studies which had major problems with treatment implementation 
also had many other problems in terms of research quality and outcome measures. 




Table 10 

Average Effect Size for Studies with 
Different Degrees of Treatment Implementation 

Degree of Treatment 
Implementation                Mean ES     n     SEM 

Complete Implementation         .12     (212)   .05 
or Only Minor Problems 

Major Problems with             .55     (337)   .03 
Treatment Implementation 



In other words, whether or not checks were made on treatment implementation 
may be indicative of the general quality of the research being conducted and 
lower quality research generally found higher effect sizes. 

Reliability of outcome measures. A final note on methodological quality 
and the ways in which it may interact with estimates of treatment impact is 
shown in Table 11. Here, the average effect size is shown for studies which 
used highly reliable outcome measures as opposed to those which used less 
reliable outcome measures. As can be seen from the number of effect sizes 
considered in each case, there were many studies for which no estimate about 
the reliability of the instrument could be made. However, in those cases where 
estimates could be made, studies which had highly reliable instruments tended 
to show lower effect sizes than those studies which had unreliable 
instruments. 

Table 11 

Average Effect Size for Instruments with 
Different Levels of Reliability 

Reliability of 
Outcome Instrument     Mean ES     n     SEM 

.80 - 1.0                .24      (41)   .07 
.60 - .79                .48     (103)   .05 
0 - .59                  .83      (10)   .17 

Summary about potentially confounding variables. The data presented in 
Tables 4 through 11 are important because they must be used in interpreting the 
results of the following section which considers the effectiveness of using 
drugs for the treatment of hyperactivity. These data indicate that there are a 
number of factors that may confound the results reported below. In doing the 
analyses reported in the next section, these potentially confounding factors 
have been accounted for wherever possible. However, the fact that such 
confounds are present in the data and that substantially different magnitudes 
of effect sizes are associated with different levels of these potentially 
confounding variables makes one more cautious about the results reported 
below. 

Effectiveness of Drug Treatment for Hyperactivity 

The most important questions concerning the treatment of hyperactivity 
with drugs are questions such as which drugs are most effective for the 
treatment of hyperactivity, whether drugs have differential impact on different 
types of outcomes, and whether drugs are more effective with certain types of 
children. Data presented in the preceding section are useful in helping to 
interpret answers to questions such as these. 

Relative effectiveness of different drugs. The data presented in Figure 3 
suggest that, in general, drugs do have a positive effect on the symptoms of 
hyperactivity, but which drugs are most effective? Table 12 presents data 
which indicate the relative effectiveness of the most frequently used drugs. 
Only six different drugs appear in this table even though more than 50 drugs 
were identified and coded in the meta-analysis. These six drugs represent 
those drugs which were most frequently used. As can be seen, dextroamphetamine 
and methylphenidate are the most frequently used drugs; and, according to these 
data, methylphenidate is the most effective. However, differences between the 
various drugs are not great. Given the relatively small number of effect sizes 
upon which these estimates are based, one cannot be completely confident about 
this conclusion. 

Table 12 

Average Effect Size for Different Drugs (Versus Control 
or Non-Treatment Group) Broken Down by Quality 
of Research Design 

                            Quality of Research 
                      High (1,2)            Low (3,4,5)           All Studies 
Type of Drug        Mean ES   n    SEM    Mean ES   n    SEM    Mean ES   n    SEM 

Chlorpromazine       -.38     10   .22                           -.38     10   .22 
Thioridazine          .35     12   .20     .25       6   .29      .32     18   .16 
Dextroamphetamine     .21     55   .09     .85      44   .11      .49     99   .07 
Methylphenidate       .44     84   .08     .42     157   .06      .43    241   .05 
Imipramine                                 .46       9   .15      .46      9   .15 
Magnesium Pemoline                         .50      30   .13      .50     30   .13 

More information on the relative effectiveness of drugs is given in Table 
13. These data are taken from those studies where one drug was compared to 
another drug. In general, the data in Table 13 support the conclusion that 
methylphenidate is the most effective drug in reducing the symptoms of 
hyperactivity. For example, in those studies which compared methylphenidate to 
dextroamphetamine, children receiving methylphenidate were .2 of a standard 
deviation higher across all dependent variables considered than were children 
receiving dextroamphetamine. These results also must be interpreted 
cautiously, however, since only 18 effect sizes are included in this 
calculation. However, given the research which has been conducted, the data in 
Tables 12 and 13 suggest that methylphenidate is the medication of choice for 
treating hyperactive children in terms of improvement on dependent measures. 
Given the slim margin of apparent benefit, however, other considerations such 
as cost, side effects, and feasibility of administration should be considered 
carefully in making choices. 



Table 13 

Average Effect Size for Comparisons 
of One Drug with Another 

"Experimental"              "Control" 
Group                       Group                 Mean ES    n     SEM 

Methylphenidate      vs.    Dextroamphetamine       .20     (18)   .16 
Methylphenidate      vs.    Imipramine              .38     (11)   .21 
Methylphenidate      vs.    Thioridazine            .69     (24)   .14 
Dextroamphetamine    vs.    Magnesium Pemoline      .41     (30)   .13 

Age of child. Age has frequently been suggested as an important variable 
in understanding hyperactivity because many authors have suggested that most 
children will "grow out of" being hyperactive by the time they enter 
adolescence. In the meta-analysis, children were categorized into four age 
groups as shown in Table 14. As can be seen, the average effect size for 
children under 9 years of age is approximately double that of children from 9 
to 12 years of age. Although these data are cross-sectional rather than 
longitudinal in nature, they do suggest that drugs are more effective with 
younger children. This may be because in those children where hyperactivity 
persists to the later ages, the condition is more severe and thus, less 
responsive to treatment than hyperactivity in younger children. 



Table 14 

Average Effect Size for Different 
Ages of Children 

Age                              Mean ES     n     SEM 

0 - 84 mos (0 - 8 yrs)             .40      (95)   .08 
85 - 108 mos (8 - 9 yrs)           .50     (441)   .03 
109 - 120 mos (9 - 10 yrs)         .28     (107)   .07 
121 - 144 mos (10 - 12 yrs)        .26      (60)   .11 

Gender. Another consistently reported finding in the literature is that 
more boys than girls exhibit hyperactivity. Table 15 shows the average effect 
size for studies which had differing percentages of boys included in their 
experimental samples. As can be seen, the studies which included almost all 
boys had an effect size roughly double that of studies which had only 50% boys. 
Furthermore, it can be seen that most effect sizes came from studies which were 
composed primarily of boys. These data could be interpreted in a number of 
ways. Perhaps drugs are more effective with boys than with girls. 
Alternatively, the data may suggest that since educators are convinced that 
more boys are hyperactive than girls, they are more likely to misidentify boys 
as being hyperactive than they are girls, and the larger effect size for boys 
has been inflated by spontaneous remission. In any case, the differences are 
substantial. 



Table 15 

Average Effect Size Broken Down by Percentage 
of Male Subjects in Experimental Group 

% Males in 
Experimental Group     Mean ES     n     SEM 

0 - 50%                  .25       86    .08 
51 - 85%                 .33      162    .05 
86 - 100%                .52      467    .001 

Relationship of socioeconomic status. The meta-analysis also examined 
whether children in different socioeconomic groups responded differently to 
drug treatment. As can be seen in Table 16, small differences were identified 
between high and low socioeconomic groups, but these differences were not large 
enough to be attributed to anything more than sampling fluctuation. 

Table 16 

Average Effect Size for Different 
Levels of SES 

SES        Mean ES     n     SEM 

High         .67      (37)   .07 
Medium       .60     (232)   .04 
Low          .54      (53)   .06 



Length of treatment. Length of treatment is another important 
consideration in administering drugs to children for hyperactivity. Is 
hyperactivity a condition that, like pneumonia, can be "cured" by the 
administration of drugs; or does administering drugs only suppress the symptoms 
but not ameliorate the condition? Some evidence on this important question is 
contained in the average effect size for children who received drugs for 
varying lengths of time. As can be seen in Table 17, average effect sizes 
tended to increase the longer the treatment was given up to 6 1/2 months. This 
trend was more pronounced when the data were limited to only high quality 
studies. The fact that the trend does not hold true after 6 1/2 months is 
probably attributable to the low number of effect sizes in those instances. 

Table 17 

Average Effect Size for Drug Treatment Based on Length of Treatment 
and Broken Down by Quality of Research Design 

             Overall ES         Quality of Research Design: mean ES (n)                     High Quality       Low Quality 
Duration of                                                                                 (1,2)              (3,4,5) 
Treatment    Mean  n      SEM   1           2           3           4           5           Mean  n      SEM   Mean  n      SEM 

1 month      .18   (148)  .06   .43 (8)     .02 (84)    .36 (49)    .47 (7)                 .06   (92)   .07   .37   (56)   .12 
1.5 months   .49   (189)  .05               .46 (104)   .53 (42)    .69 (25)    .34 (18)    .46   (104)  .07   .54   (85)   .08 
3.3 months   .63   (188)  .05               .56 (41)    .62 (105)   .69 (12)    .71 (30)    .56   (41)   .11   .64   (147)  .06 
6.6 months   .90   (36)   .12               .88 (18)                            .99 (16)    .88   (18)   .16   .99   (16)   .18 
12 months    .01   (9)    .23                           .01                                                    .01   (9)    .23 
5 years      .51   (57)   .09   .29 (14)                .36 (7)     .24 (7)     .72 (29)    .29   (14)   .19   .58   (43)   .11 

Additional information on this question is presented in Table 18 which 
examined length of treatment by type of drug used. Only dextroamphetamine and 
methylphenidate had a sufficient number of effect sizes to be considered. As 
can be seen, however, the same trend appears to hold. 

Table 18 

Average Effect Sizes for Dextroamphetamine and Methylphenidate 
According to Length of Treatment 

                                      Duration of Treatment 
Drug                 1 month     1.5 months   3.3 months   6.6 months   12 months    5 years 

Dextroamphetamine    -.15 (36)   .25 (24)     .67 (71)     .99 (9) 
Methylphenidate      .22 (75)    .55 (89)     .70 (45)     .84 (8)                   .49 (57) 



Type of outcome. As noted in the beginning of this report, hyperactivity is 
characterized by a variety of problems. What is the effect of drugs on these 
various problems? This important question is answered to some degree by the data 
in Table 19 which examined the average effect size for different types of 
outcomes. The outcomes included in the table range from general hyperactivity 
and general behavior to indications of impulsivity, attention/vigilance, and IQ 
and academic achievement measures. As can be seen, the most substantial effects 
are found for those outcomes which are most subjectively measured and, hence, 
most susceptible to bias. However, when we limit these studies to those that 
were of high quality, substantial effects are still present for general 
hyperactivity, general behavior, activity level, academic achievement, and 
attention/vigilance. Much lower effects are seen for aggression, impulsivity, 
IQ, and affective outcomes such as self-concept. Unfortunately, many of these 
estimates were based on rather small numbers of effect sizes. However, given 
the currently available data, Table 19 contains important information about 
the outcomes for which drugs are most effective. 

Table 20 should be considered in conjunction with Table 19. As can be
seen in Table 20, the average effect size for outcomes that are gathered via
opinions is substantially higher than that for any of the other ways of
collecting data. Those methods for gathering data which are least subject to
bias, such as systematic observation, actometers, and experimental tasks, show
much lower effect sizes in general than measures which are more subject to
bias, such as opinions and ratings.

Degree of structure in setting where outcome measured. An important
question has been raised in the literature about the degree to which the
results of treating hyperactivity with drugs vary depending on the type of
setting in which data are collected. Some researchers have suggested that
drugs are primarily useful because they help the child to control their
impulsivity and to remain on-task. These people have argued that in
unstructured or free play settings, drugs may not have such a noticeable
effect. If true, the lack of drug effect in unstructured settings would arise
because hyperactivity is more a problem of impulse control than of excessive
activity, and unstructured settings do not require as much impulse control.
Table 21 shows the results of studies in which children were observed in
structured and unstructured settings. These data tend to support the
hypothesis that hyperactivity is more a function of impulse control than of
excessive activity. Among high quality studies, the average effect size
obtained in structured settings (.79) is more than twice as high as the
average effect size in unstructured settings (.33).



Table 19

Average Effect Size for Different Outcomes Broken
Down by Methodological Quality of Study

                      Overall ES          Quality of Research Design (High = 1)                              High Quality        Low Quality
                                                                                                             (1,2)               (3,4,5)
Type of Outcome       ES    n      SEM    1          2          3          4                5                ES    n      SEM    ES    n      SEM

General Hyperactivity .64   (175)  .05               .66 (48)   .60 (69)   .73 (illegible)  .61 (illegible)  .66   (48)   .10    .63   (127)  .06
General Behavior      .74   (105)  .05               .79 (40)   .63 (40)                    .89              .79   (40)   .11    .72   (63)   .09
Activity Level        .64   (43)   .11    1.21 (6)   .26 (11)   .71 (20)                                     .59   (17)   .17    .66   (25)   .14
Achievement           .63   (27)   .13               .62 (20)                                                .62   (20)   .16
Aggression            .47   (18)   .16               .20        .46 (8)                                      .20   (8)    .25    .46   (8)    .25
Impulsivity           .41   (32)   .12               .21 (11)   .41 (14)                                     .21   (11)   .21    .41   (14)   .19
Attention/Vigilance   .23   (91)   .07               .45 (23)   .38        -.23 (29)                         .45   (23)   .15    .16   (66)   .09
IQ                    .20   (117)  .06    .27 (10)   -.10 (50)  .41 (22)   .05 (16)         .32 (7)          -.05  (60)   .09    .33   (35)   .12
Affective             .11   (33)   .12               .15 (22)                               .17 (7)          .15   (22)   .15    .17   (7)    .26
Table 20

Average Effect Size for Different Types
of Instruments Used to Measure Outcomes

Type of Instrument          ES             n              SEM

Opinion                     [illegible]    [illegible]    [illegible]
Rating                      [illegible]    [illegible]    [illegible]
Systematic Observation      [illegible]    (28)           .13
Actometer                   .49            (34)           .12
Standardized Test           .37            (107)          .07
Experimental Task           -.08           (123)          .06



Table 21

Average Effect Size for Ratings Collected in
Structured and Unstructured Settings

                                  Quality of Research

                       High (1,2)              Low (3,4,5)
Type of Setting        ES    n      SEM        ES    n      SEM

Structured             .79   (35)   .12        .77   (27)   .13
Unstructured           .33   (63)   .09        .55   (97)   .07
Methodological Notes

In conducting a meta-analysis of this scope, some interesting findings
regarding the methodology of research integration are often identified. One
important finding that emerged from this study concerns the data used in
computing effect sizes. The notion of using a common metric for comparing
outcomes across studies and types of dependent variables provides substantial
flexibility and power in examining the results of previous research.
Unfortunately, as pointed out in the Procedures section, many studies do not
report the means and standard deviations necessary for computing an effect
size. In those cases, alternative estimation methods have been proposed and
are frequently used in meta-analysis studies. These estimation methods employ
various assumptions which are infrequently, if ever, checked. Table 22 reports
the average effect size obtained for different ways of computing effect sizes.
It is a matter of concern that the average effect size for studies which
reported means and standard deviations (.29) is much lower than that for
studies in which effect sizes had to be estimated using one of the other
approaches (.70 and .82). These data suggest that effect sizes may be somewhat
inflated when they are calculated from t or F ratios, t or F probabilities, or
the percentage of the sample exceeding a given criterion (e.g., percentage
improved). This question deserves further investigation.
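To make the three routes to an effect size concrete, the sketch below uses common textbook formulas: a standardized mean difference, a conversion from a t ratio for two independent groups, and a probit difference between two proportions. These are assumptions for illustration only; the report's own conventions are given in its Procedures section, and the function names and example numbers here are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def es_from_means(mean_treated, mean_control, sd):
    """Standardized mean difference; sd may be the control or pooled SD."""
    return (mean_treated - mean_control) / sd

def es_from_t(t, n_treated, n_control):
    """Effect size recovered from a t ratio for two independent groups.
    For an F ratio with 1 numerator df, t is the square root of F."""
    return t * sqrt(1 / n_treated + 1 / n_control)

def es_from_percent_improved(p_treated, p_control):
    """Probit approach: difference of the normal deviates of two proportions.
    Assumes the underlying outcome is normally distributed in both groups."""
    z = NormalDist().inv_cdf
    return z(p_treated) - z(p_control)

print(es_from_means(24.0, 20.0, 8.0))                   # prints 0.5
print(round(es_from_t(2.0, 20, 20), 2))                 # prints 0.63
print(round(es_from_percent_improved(0.70, 0.40), 2))   # prints 0.78
```

Each estimator relies on distributional assumptions that the direct calculation from means and standard deviations does not need, which is one reason the routes can disagree.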



Table 22

Average Effect Size for Different Ways of
Constructing Effect Size Estimates

Information Used to
Construct Effect Size                 ES      n        SEM

Means and SD (either control,
  pooled, published)                  .29     (452)    .03

t or F ratio or probability           .70     (81)     .05

Percentage Improved
  ("Probit" Transformation)           .82     (120)    .06
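As a rough gauge of how large the gaps in Table 22 are relative to their standard errors, the sketch below divides each difference in mean effect size by the combined SEM. This is not a test the report ran; it assumes the effect sizes are independent, which is doubtful because many studies contributed several effect sizes, so the ratios should be read only as an order of magnitude. The function name is hypothetical.

```python
from math import sqrt

def z_for_difference(es_a, sem_a, es_b, sem_b):
    """Difference between two mean effect sizes in units of its standard error."""
    return (es_a - es_b) / sqrt(sem_a ** 2 + sem_b ** 2)

# t or F based estimates versus estimates from means and SDs
z_t_or_f = z_for_difference(0.70, 0.05, 0.29, 0.03)
# probit (percentage improved) estimates versus estimates from means and SDs
z_probit = z_for_difference(0.82, 0.06, 0.29, 0.03)

print(round(z_t_or_f, 1), round(z_probit, 1))
# prints: 7.0 7.9
```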
Table 23 contains information regarding the relationship between authors'
conclusions and computed effect size. Many critics of the research literature
have suggested that authors place entirely too much importance on the
statistical significance of findings and often conclude that a treatment has
worth based on statistical significance when the educational importance of the
finding is trivial. This argument is contradicted by the average effect sizes
for studies where authors concluded that the treatment worked (.86), that its
effect could not be determined (.42), or that it did not work (.04). Based on
these averages, it appears that most researchers have a fairly good
understanding of what constitutes educational or clinical significance.



Table 23

Average Effect Size Associated with
Different Conclusions by Author(s)

                              ES      n        SEM

Treatment works               .86     (297)    .03
Cannot determine              .42     (36)     .07
Treatment does not work       .04     (247)    .04



Summary 

The results of the meta-analysis suggest that drugs are a moderately
effective treatment for hyperactivity in children. Of those drugs available,
methylphenidate appears to be the most effective, but the margin of advantage
is very slim. Furthermore, a number of drugs which appear promising have not
been investigated sufficiently to draw firm conclusions as to the relative
advantages of methylphenidate.

The general conclusion that drugs are moderately effective in treating
hyperactivity must be placed in the appropriate context. A fairly large number
of variables were identified which indicate that the apparent benefits of drugs
for treating hyperactivity are somewhat overestimated. For example, accounting
for factors such as poor procedures in classifying children as hyperactive, low
quality research designs, unreliability and bias in outcome measures, and
suggested bias by those people supporting much of the research being conducted
tends to reduce the obtained effect size of drug treatment.

The data also suggest that drugs are more effective with younger children
than with older children, with boys than with girls, and when continued for
longer periods. The greatest effect of drugs is for outcomes which are more
generally defined. Those studies which have considered the effect of drugs on
such variables as IQ, impulsivity, and attention have found much smaller
effects. There is support in the literature for the suggestion that
hyperactivity is more a problem of impulse control than of excessive activity,
because drugs are substantially more effective in structured settings than in
unstructured settings.

Although the meta-analysis has done much to clarify the results of
previous research on whether drugs are an effective treatment for
hyperactivity, many questions remain. The data are suggestive about the
relative effectiveness of different drugs, but by no means definitive. Further
research needs to be done comparing which drugs are most effective for which
children. Also, the data regarding the degree of structure in the setting in
which the outcome is measured raise important questions about the definition
and etiology of hyperactivity. These questions need further investigation.

Much of the data from the meta-analysis also suggest guidelines for future
research. Although a good deal of research has been done well, much of the
research on which the meta-analysis is based suffered from one or more
important problems. The results of the current meta-analysis suggest that
future research should attend more carefully to the following: bias in the
support of the research, bias in the collection of data, rigorous
classification of children as hyperactive, the use of placebos, and procedures
for assuring that implementation has occurred. Finally, the meta-analysis data
suggest that questions concerning age, sex differences, and length of treatment
ought to be investigated further. Of particular importance in such ongoing
investigations would be longitudinal studies and follow-up studies of children
who have previously received drug treatment for hyperactivity.
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Review Articles 
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Review References 


Type of 
Sample • 


Method of 
Selection 
Spec1f1ed?b 


Nuff^er of 
Primary 
Studies 
Cited c 


Previous reviews 
cited, critiqued, 
and expanded?^ 


Outcomes of individ- 
ual studies r-eported 
In terms of^ 


How were 
concomitant 

variables- 
considered?^ 


Adelwan, H. S., i tompas, B. E. 
Stimulant druqs and learninq problems. 
Journal of Special Education^ 1977, 11, 
377-416. 


Convenience 


No 


cl 


Nn 

no 


Statistical signlf- 

ir xnro * Hi f f pr pnrpc 

Study by study; 
brief differences 


Not 
considered 


Allen, ft. P., safer, D., & Covi, L. 
Effects of psycbost imulants on 
aqqression. The Journal of Nervous and 
Mental Disease, 197b, 160. liii-145. 


Convenience 


No 


3 


No 


Differences study 
by study 


Not 
considered 


Bakwin. H. Bertzedrine in behavior 
disorders of Children. The Journal of 
Pediatrics, 1948, 32, 215-216. 


Convenience 


No 


3 


No 


Differences study 
by study 


Not 
considered 


Ba^'kiey, K. A. A review of stimulant 
druq research with hyperactive children, 
tburnal of Child Psychology and 
Psychiatry, 1977, 1&, 137-165. 


Convenience 


No 


37 


No 


Brief differences 


Not 
considered 


Barkley, R. A. Predictinq the response 
of hyperkinetic children to .stimulant 
druqs: A review. Journal cf Abnormal 
Child Psychol-gy, 1976, 4, 327-348. 


Convenience 


No 


78 


Yes 


Brief differences 


Nnt 

considered 


Barkley, R. A. Recent developments in 
research on hyperactive children. 
Journal of Pediatric Psychology. 1978, 
3, l5d-l6J. 


Convenience 


No 


0 


No 


Not reported 


Not 
considered 


Barkley, R. A., ft Cunninqham, C. t. Do 
stimulant drugs Improve the academic 
performance of hyperactive children? 
Clinical Pediatrics, 1978, 17, 85-92. 


Convenience 


No 


42 


No 


Statistical signlf- 
1 cance y QiTTerence> 
study by study; 
brief differences 


subset 


Bradley. E. Academic, behavioral, and 
psychological responses of hyperactive 
children to stimulant medication. 
Unpublished master's thesis. 
Northeastern Illinois University, August 
1975. {ED 116413) 


Convenience 


No 


cS 


no 


Statistical signif- 
icance; differences 

a L U U Jf uj a !• u u jr 1 

brief differences 


\ 

Not 
cons idered 


Bower, K. B. Hyperactivity: Etiology 
and intervention techniques. The 
Journal of School Health, 19757^. 
195-202. 


Convenience 


No 


18 


No 


Statistical slqnif- 

ll.aiil.c, aiii^ic 

subject; differen- 
ces study by study; 
brief differences 


Not 
considered 


Calhoun, G. Hyper act tve jemotional ly 
disturbed and hyperkinetic learning 
0 1 S an 1 1 1 u 1 es . n cnaiienge lor tnc 
reoular classroom. Adolescence, 1978, 
13, 335-338. 


Conven ifince 


No 


1 

i 


No 


Differences study 
by study 


Not 
considered 



O Key ^or .iLerpretIng each column appears at the end. 
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Type of 
Sample ^ 


Method of 
Selection 
Specified? ° 


Number of 
Primary 
Studies 
Cited^ 


Previous reviews 
cited, critiqued, 
and expanded?^ 


Outcomes of individ- 
ual studies reported 
in terms of* 


nun nci c 

concomitant 
variables 
considered? * 


Cole, 0. Hyperkinetic chndren: The 
use o' stimulsat drugs pvaluated. 
Aiit>rtr^n Jaurn.O of "rt^.^psych i atry, 


Convenience 


No 


11 


No 


Single subject; 
<lifferences study 
by study; brief 
differences 


Not 
considered 


lonners, C. K. npcent ^ruq studies with 
hyperkinetic children, vburrnl of 
Learning Oisabnuies, IW^^T^T^'^^^- 


Convenience 


No 


14 


Yes 


Statistical signif- 
icance; differenced 
study by study; 
brief differences 


Not 

consid^ered 


Cronin, J. ?. The use of psychopharma- 

c* imii ^r*rl^c ^niT fhp control Of 
childhood hyperklnesis. Minneapolis, 
Minnesota: University of Minnesota, 
1976. (E^IC Document Reproduction 
Service No. ED126 654) 


Convenience 


No 


4 


No 


Statistical signif- 
icance; single 
subject; differen- 
ces study by study; 
brief differences 


Not 
considered 


Cunninqnam, C. £., ^ Berkley, R. A. Ihe 
role of acad»^tc failure in hyperactive 
behavior. Journal of Learninq 
Disabilities, 1978, Jl, 15-21. 


Convenience 


No 


17 


No 


Single subject; 
differences study 
by study; brief 
differences 


Not 
considered 


Tnnnarf;, c K PhariTiacother ao V of 
psychopathploqy In children. In H. Quay 
h A. Werrv (Eds.K Psychopathologica] 
disorders of children. New York: Wiley 
I Sons, 19/2. 


Convenience 


No 


84 


No 


Statistical signif- 
icance; differences 
study by study; 
brief differences 


Not 
considered 


DeLong, A. R. What have we learned from 
psychoactive drpjq research on hyper- 
actives? American Journal of the 
Disabled Child, 1972, 123, l7/-i8U. 


Convenience 


No 


0 


No 


Not considered 


Not 
cons idered 




Dyck. N. J. Educational management of 
kon£i' «/>f iuo ^htlrlron In M *! . Fine 

nypCi aC w iVC LUIIUiCM. IIP II. *f 

(Ed.)* Principles and techniques of 
Intervention with hyperactive children. 
Springfield, Illinois: Charles C. 
Thomas, 1977. 


Convenience 


No 


12 


No 


Statistical signif- 
icance; differences 
study by study; 
brief differences 


Not 
considered 


ricanKnr>n I I. fnnnpr^ C K. PsVChO" 

pharmacology in childhood. In N. B. 
Talbot, J. Waqan, Si L. Eisenberg 
(Eds.). Behavioral science in pediatric 
medicine. Philadelphia: baunders, 19/1. 


Convenience 


No 


39 


No 


Statistical signif- 
icance; single 
subject; differen- 
ces study by study 
brief differences 


Not 
considered 


Tpstein. E. P., Marrinqion, N. 0., 
Meagher, J. A., Rowlands, E. L,, i 
Simons, R. K. Chemotherapy and the 
hvnerkinetic Child. Journal of 
Education, 1968, 1^, 47-60. 


Convenience 


No 


1 


No 


Differences study 


Not 
considered 



■'"'Key for interpretlnq each column appears at the end. 
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Type of 
Sample * 


Method of 

Selection. 
Specified?" 


Numh^r of 

Primary 
Studies 
Cited^ 


Previous reviews 
cited, critiqued, 
and expanded? 


Outcomes of individ- 
ual studies reported 
in terms of 


How were 
conco«»itant 

variables - 
considered? ' 


Tefqhner, ^. t., S feighnier, J, TT 

hyperkinetic child. American Journal of 
Psychiatry. 1974, 131, 459-463. 


Convenience 


No 


9 


No 


Brief differences 


Not 
considered 


Fischer, K. C, h Wilson, W, P. 

M^kthu^ nHi&n l^a an.-4 thi> h vnprif *{ n»> t { T 
nCT tny 1 pficn 1 uaLc anu tnc ujr^'cr ^ iiici. iv. 

State, Dis«*asps of tbe Nervous System, 
1971. 695-698. 


Convenience 


No 


8 


No 


Differences study 
by study 


Not 

considered 


r 1 >n » o. uruy use in psycnsaLric 
disordtTs of children. American Journal 
of Psychiatry, -1968, 124, 31-36. 


Convenience 


No 


7 


No 


Statistical slonif^ 
icance; brief 
differences 


Not 

considered 


Freeman, R. 0. Drug effpcts on learning 

111 Ciiiiurdi* 3t.ici.Livr? review ui i»nc 

past thirty years. The Journal of 
Special Education, 1956. 1, 17-44, 


Convenience 


No 


26 

'\ 


No 


Statistical sidnif- 
icance; differences 
study by study; 
brief differences 


Not 
considered 


TVeSnTin ""TT" n Rpvi^w of medicine in 
Special education. Journal of Special 
Education, 1970, 4, 377-384. 


Convenience 


No 


9 


No 


Differences study 
by study; brief 
differences 


Not 
considered 


eiennon, C. A., & reason. D. E. Manaqing 

^nc ucnoviur I'r tiic iijrj-'ci tivc uiiiiu. 

What research says. Reading Teacher. 
1974, 27, 815-824. 


Convenience 


No 


3 


No 


Statistical slanif- 
icance; differences 
study by study; 
brief differences 


Not 
considered 


Grant. D. R. Psychopharmacology in 
childhood eniotional and frental 
disordf'rs. Journal of Pediatrics, 1962, 
61. 626-637. 


nOSl 

research 


No 


62 


No 


Icance; differences 
study by study; 
brief differences 


Not 
considered 


Grin spoon, L.. 4 Singer, S. 8. 

niTijJ firr I di.i J rieb in Lilc t'CaciiiCiit ui 

hyperkinetic children, fiarvard 
Educational Review. 1973, 43, 515-555. 


Convenience 


No 


15 


No 


Statistical signif- 

\cjir\e'Q' ^Innlp 

subject; differen- 
ces study by study; 
brief differences 


Not 
considered 


Haviqhurst, R. J. Choosing a middle 

Dtith fnr th^ ii<;p of dnin^ with 

hyperactive children. School Review, 
1976, R5, 61-77. 


Convenience 


No 


2 


No 


Single subject; 
differences study 
by study 


Not 
considered 


Hirst, I. Effects of the psychoactive 
drug: Hethylphenidate (Ritalin) on 
classroom disorders: Hyperactivity, 
emotional disturbance, and learning 

disorders. Paper presented at the 
Anniiitl Interndtinnal Convention^ The 
Concil for Exceptional Children, 
Chicago. April 1^'6. 


Convenience 


No 


3 


No 


Differences study 
by study 


Not 
considered 



Key for -nterpretlng each column appears it the end. 
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CHARACTERISTICS OF HYPERACTIVITY REVIEWS (P. 4) 











Type of 
Sample 


Method of 
Selection 
Specified?" 


Number of 
Primary 
Studies 
Cited 


Previous reviews 
cited, critiqued, 
and expanded? 


Outcomes of Individ- 
uai StUQies reporteo 
in terms of^ 


HOW were 
concomitant 

considered? 


Keaqn, &. H>Merict{vUy and learninq 
disorders: Review and speculation. 
Exceptional Children, 1971, 38, 101-109. 


Convenience* 


No 


5 


No 


Differences study 
by study 


Not 
cons idered 


ITorm'fsky, C. Psycnoactive liruqs in the 
iTiiture orqjnism. Psychopharmacologia, 
1970, 17, lOb.136. 


Convenience 


No 


16 


No 


Statistical signif- 
icance; differences 
study by study 


Not 

/*nnc iHovoH 
CUiib lUcrcu 


Lamhe^'t, H. , windmiller, M. , 
Sandoval, J., & Moore, B. Hype'-active 
rk\\\f*.i>f\ Aiiil thp pfficacv of 
psychoactive .druys as a treatment v 
intervention. American Journal of 
Orthopsychiatry, 1976, 46, 335-352. 


Convenience 


No 

S 


37 


No 


Statistical signif- 
icance; differences 
study by study; 
brief differences 


Not 

considered • 


1 ;iii/pr M U npnhOff E . h SOlOHIOnS* 

6. Hyperkinetic impulse disorder in 
children's behavior problems. 
P^vrho^iomat ic Medicine. 1957. 19, 38-49. 


Convenience 


No 


4 


No 


Differences study 
by study; brief 
differences 


Not 
considered 


Lrsser, L. L. Hyperkinesis In children. 
Clinical Pediatrics. 1970, 9, 548-552. 


Convenience 


No 


2 


No 


Brief differences 


NOv 

considered 


[ irmAn R ^ support of 
research In minimal brain dysfunction 
and other disorders of childhood. 
Psychopharmacoloqy Bulletin, 1973, 1-8. 


Convenience 


No 


13 


No 


Differences study 
by study; brief 
differences 


Not 
cons loerea 


LiDton. M. et a1. Report to the 
Nutrition Foundation. New York: The 
Nutrition t^oundation, Inc^, 1975. 


Convenience 


No 


2 


No 


Differences study 
by study 


Not 

cuns (□crCll 


Hdoonald, J. t. Pharmacologic treatment 
Ami h^hAvlor theraDv* Allies in the 
manaqement of hyperactive children. 
Psychology in the Schools, 1978, 15, 
275.274. 


Convenience 


No 


11 

\ 


No 


Statistical signif- 
icance; single 
subject 


Not 
considered 


MillirhAD J 6. & Fowler. G. W. 
Treatment of "minimal brain dysfunction* 
syndromes. Pediatric Clinics of North 
America, 14(4), 1967, 767-///. 


Convenience 


No 


36 ' 


No 


Percent Improved 


Logical for 
subset 


FTira, M. , I Reece, L. a. nedicai 
manaqement of the hyperactive child. In 
M. J. Fine (Ed.). Principles and tech- 
nique's of intervontion with hyperactive 
clnldron. Spr inqf leld, Illinois: 
Charles C. Thonas, 1977. 


Convenience 


No 


17 


No 


Statistical signif- 
icance; difference* 
study by study; 
brief differences 


s Not 
considered 


*"^Key for interpreting each column 

ERLC 


appears at the Md. 



laS (p. 



-Si 



Rt>v Ref erenctfs 

O'Leary, K. 0. Pills O'^ skiur^for 
n>peractive children. Presidential 
address to Clinical Division, 
Ejtper imental Behavioral Science, 
/Vnerjc an Psychological Association, 
Toronto, Canada, August 1970. 



Patterson, G. R. Behavioral interven- 
tion procedures in the classroom and in 
the home. In A. E. Berqin & S. L. Gar- 
field (Eds.), Handbook of Psychotherapy 
and Behavior Change . New York: W1 ley & 
Sons, l9TL~* 



Prout, H. T. Benaviora] intervention 
with hyperactive children: A review. 
Journal of Learning Disorders , 1977,. 
10(3), 141-14Z. 



Ross, 0. M., h ROSS, b. D. 

Hyperactivity: Research, theory, and 
action . New York: rtiley, Interscicnce 
Publication, 1976. 



Safer, 0, J, Drugs for problem school 
Children. The Journal of School Health , 
1971, 41, 491-495. 



Type of 
Sample 



^^^thod of 
^ Selection, 
Specified?^ 



Convenience 



Methodological 
superiority 



Convenience 



Convenience 



Convenience 



No 



No 



No 



No 



No 



Number of 
Primary 
Studies 
Cited^ 



29 



33 



14 



120 



Previous reviews 
cited, critiqued, 
and expanded? 



No 



No 



Yes 



No 



No 



Outcomes of individ- 
ual studies reported 
in terms of 



Statistical signlf 
icancc; differences 
Itudy by study; 
brief differences 



Single subject; 
differences study 
by study; brief 
differences 



Statistical signlf 
icance, time 
series; differences 
study by study 



Statistical signlf- 
icance; single , 
subject; diff eren 
ces study by study; 
brief differences 



Statistical signlf 
icance; differences 
study by study; 
brief differences 



How were 
concomitant 

variables^ 
considered? 



Not 
considered 



Not 
considered 



Not 
considered 



Not 
considered 



Not 
considered 



5chrager, J,, Harrison, S., McOermott, 
J., Wilson, P., & Lindy, J. The 
hyperkinetic child; an overview of the 
issues . Ann Arbor, Michigan: The 
University of Michigan Medical Center, 
Department of Psychiatry, 1965 
(estimated from latest references in 
bibi iography). 



Convenience 



No 



No 



Brief differences 



Not 
considered 



bieben, k. l. Controversial medical 
treatment of learning disabilities. 
Academic Therapy , 1977, 13, 133-147. 



Convenience 



No 



No 



Single subject; 
difiperences study 
by study 



Not 
considered 



Silver, L. H. Acceptable and controver- 
sia^ apprx)aches to treating the child 
with learning disabilities. Pediatrics , 
1975, 55, 406-415. 



Convenience 



No 



15 



No 



Differences study 
by study;- brief 
differences 



Not 
considered 



Sprague, R. L., & Werry, J, 5. 
Methodology of psychopharmacological 
studies with the retarded. In N. R. 
Ellis (Ed.), International Review of 
research in mental retardation (Vol. 5). 
New York; Academic Press, 1971, 147-219 



Most 
research 



No 



184 



Yes 



Statistical signif 
icance; single 
subject; differen- 
ces study by study; 
brief differences 



Logical for 
subset 



a-fKey for interpreting each column appears at the end.- 




' CHARACTERISTICS OF HYPEW«;TIVITV REVIEWS (P. 6) 









Rev u»w Rt'f 'TfM t t»s 


Type of 
Samp>e* 


Method of 
Selection* 
Specified?^ 


Number of 
Primary 
Studies^ 
Cited^ 


Previous reviews 
cited, critiqued^ 
and expanded? 


Outcomes of individ- 
ual studies reported 
iff^terms of 


HOW were 
concomitant 

variables ^ 
considered? 


Spring, C, I ^anjoval,"J. rioJ 

dd0itiV6S iHu nyptfrXineS 1 S . M critlu<Ji 

evaluation of the evidence. Journal of 
learning Disabilities, 1976, 


Most 
research 


No 


6 


.No 


Statistical signif- 
icance; differences 
study by study; 
brief differences 


Not 
cons idered 


Sroufe, L, A., ii Stewart, M. A. 

T^^^^inwt f«a*/«K 1 Am fKiil/l»*an tall t h ^timtilAnt 
ireavinQ proDieni cniioren wiijn at uiiu lam, 

druqs. New'EnglanJ Journal of Medicine, 

1973, 289, 407*413. 


Convenience 


No 


30 


No • 


Differences study 
by study; brief 
differences 


Not 
cofvs Idered 


Swift, M. S., J» Spwack, G. Therapeutic 
teaching; A review of teachinq methods 
for oehaviora'iy trouDieo cnMar»?n. \ ne 
Journal of Special Education, 1974, 8, 
259-23^. 


rnnv^n i @nce 


No 


28 


No 


Differences study 
by study; brief 
difference^ 


Not 
considered 


Taylor, E, FooQ additives, allergy, and 
hvocrkinesis. Journal of Child 
Psychology, 1979, _20. 3t)/-JbJ. 


Convenience 


No 


8 


< 

No 


Statistical signif- 
icance; differences 
study by study; 
brief differences 


Not 
considered 


ToFTessen, J., ft Karowe, H. E. A role 
for the school in the pharmacological 
treainent or nyperK i^rL it tiiiiurcii. 
Psvchaloqy in the Schools. 1969, 6, 


Convenience 


No 


10 


No 


Statistical signif- 
icance; differences 
study by study; 
brief differences 


Not 
considered 

• 


Weiss, G. , 01 necntnian, l. me nyper- 
active child syndrome. Science, 1979, 
205, 1348-1354. 


Convenience 


No 


23 


No 


Statistical signif- 
icance; differences 
study by study; ' 
brief differences 


Not 
considered 


werry, J. >>. ueveiopinenLa i 
hyperactivity. Pediatric Clinics of 
North America, 1968, 15, 58l-b99. 


Convenience 


No 


13 


No 


Differences study 
by study; brief 
differences 


Not 
considered 


WilManis, J. I., i Cram, D. M. Diet In 
tne management of hyperkinesis. 
Canadian Psychi«<tric Association 
Journal, 1978, 23, ^41-248. 


Convenience 


No 


10 s 


No 


Differences study 
by study 


Not 
considered 


Wolraich, M, L. oehavior moaiTication 
therapy in hyperactive children. 
Clinical Pediatrics, 1979, 18, 563-570. 


Convenience 


No 


1 


No 


Brief differences 


ttot 
considered 


Wolraich, M. L. stimulant drug therapy 
In hi/noff* t i v«i rh^lHriin* Research and 
clinical imol ications. Pediatrics, 
1977, 60, 512-518. 


Convenience 


No 


62 


Yes 


Differences study 
by study; brief 
differences 


Logical for 
subset 


3"ClCey for interpreting each column appears at the end. 

% 
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• • • • ' # • # • • 

DURACT ERISTICS OF HYPERACTIVITY REVTEWS~(P. f) 



t * 

Review References 


Type of 
Sample 


Method of 
Selection. 
Specified?^ 


Number of 
Primary 
Studies 
Cited^ 


Previous reviews 
cited, critiqued, 
and expanded? 


Outcomes of individ- 
ual studies repQrted 
in terms of 


How were 
concomitant 

variables 
considered? 


Wun^erilch, R. c. treatment of the 
hYDer^ctive "child. . Academic Therapy, 
1973, 8, 375-390. 


Convenience 


No 


0 


■ No 


Not considered 


Logical for 
subset 


Zentall. 5. 5. tnvironmental 
Stimulation model. Exceptional 
Children. 1977. 43, 502-510. 


Convenience 


No 


26 


No 


Brief differences 


Logical for 
subset 


ZentaiK ^. Optlmii stimulation as 
theoretical basis of hyperactivity. 
American Journal of Orthopsychiatry, 
1975, 45, 549-563. 


Convenience 


No 


32 


No 




Differences study 
by study; brief 
differences 


Logical for 
subset 



*The type of sample was coded as convenience, methodologically superior , representative , or comprehensive . Decisions about the type 
of sample included in each review were somewhat subjective. If the review was based on a limited number of studies and gave no procedures 
for the selection techniques used to assure representativeness, it was- assumed that the sample'was a convenience sample. If procedures had 
be€n used to assure a representative or .comprehensive sample, we assumed the author would have mentioned them. Samples coded as 
methodologically super iqr were described as such by the author (s).^ 



^To be coded "yes," the specific procedures used to identify and select articles for the review had to be described. It was not 
enough to say, "articles were limited to research on home-based programs," ^ince the procedures for identifying and selecting home-based 
articles are not specified. * 

^Nurabers cited in this column represent only those empirical research articles used to support a contention about the effectiveness 
of a treatment for hyperactivity. All articles listed in the article's bibliography or reference section are not listed. 

^To.be coded "yes," a review article had to cite previous review articles and critique them and^ explain how the Current review would 
differ from or expand on previous reviews— all three were necassary. 

o 

*The way In which the outcome of each study was reported in the review was coded as Effect Size (i.e., any kind of measure which 
could be compared across studies like Glass's ES, eta [n] squared, r2, or omega [ ] squared); Statistically Significant (i.e., the 
statistical significance whether in favor or against the particular treatment reported for each study; Differences (iT?., studies were 
considered Individually and differences found were reported but without reporting an ES measure or statistical significance); Brief 
Differences (i.e., differences found by studies were reported in groups instead of study by study); or Single Subject Designs . Entries in 
the COiiinn represent the most frequent way(s) of reporting outcomes of individual studies for a particular revlewT 

The way in which the review considered how study characteristics covaried with outcome was coded as Systematically (data based)
(i.e., the covariation was examined where possible for all studies in the sample); Logical for Subset (i.e., covariation was discussed for a
substantial number of studies in the review but not using a data-based approach); or Not Considered.
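The note on outcome reporting names Glass's ES as one measure that can be compared across studies. The sketch below is only an illustration of that measure, assuming the usual definition (difference between experimental and control means divided by the control group standard deviation); the function name and the scores are hypothetical and do not come from the report.

```python
from statistics import mean, stdev


def glass_effect_size(experimental, control):
    """Glass's ES: (experimental mean - control mean) / control group SD."""
    sd_control = stdev(control)  # sample SD of the control group (n - 1 denominator)
    if sd_control == 0:
        raise ValueError("control group has zero variance; ES is undefined")
    return (mean(experimental) - mean(control)) / sd_control


# Hypothetical scores, invented purely for illustration.
experimental_scores = [14, 16, 18, 20, 22]
control_scores = [10, 12, 14, 16, 18]
effect_size = glass_effect_size(experimental_scores, control_scores)
print(round(effect_size, 2))
```

Whether a positive value favors the treatment depends on the direction of the outcome scale, so the sign would have to be set by convention when studies are combined.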
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Appendix 2 
Coding Sheet for Efficacy of Drug
Treatments for Hyperactivity

META-ANALYSIS OF HYPERACTIVITY
Coding Instrument

Coder: ____________
(Authors) ____________
(Title) ____________
Question for [illegible]: explain on back

I

(1-4)   ____  1) Study ID #
(5,6)   ____  2) Year
(7-10)  ____  3) Source
(11)    ____  4) Supported by commercial company
                 (1 = yes, complete; 2 = yes, partial; 3 = no; 4 = not stated)
(12)    ____  5) Dissertation
                 (1 = dissertation article w/ ES estimated; 0 = all others)
(13)    ____  6) Side Effect (1 = yes; 0 = no)
                 (specify) ____________

Check List

____ 1. references needed from this article checked
____ 2. computations for ES shown
____ 3. every blank marked
____ 4. test names listed for ES's
____ 5. references of test articles to Bev
____ 6. comments about conventions
____ 7. disagreement about conventions
____ 8. "coded" written on article
____ 9. references and additions for future mini meta-analyses



II DESCRIPTION OF SAMPLE

(Each item has one column for each effect size: ES6 ES5 ES4 ES3 ES2 ES1)

1. MEAN AGE (months)
      Experimental (- = not given)
      Control (- = not given)

2. MEAN IQ (1 = 130+, 2 = 71-129, ...)
      Experimental (- = not given)
      Control (- = not given)
      (test) ____________

3. SIZE
      Experimental (- = not given)
      Control (- = not given)

4. SES
      Experimental
      Control
      (how SES determined) ____________

5. SEVERITY (... 3 = severe, 4 = extreme, 5 = mixed, - = unable to tell)
      Experimental
      Control

6. DIAGNOSIS
      Experimental
      Control

      1 = normal
      2 = hyperactive/hyperkinetic
      3 = MBD
      4 = LD
      5 = ED
      6 = attention deficit disorder
      7 = other (specify) ____________
      - = unable to tell



(Each item has one column for each effect size: ES6 ES5 ES4 ES3 ES2 ES1)

7. % HANDICAPPED (code %; -9 = some, exact % not known; - = not given)
   A. MULTIPLE
         Experimental
         Control
   B. DEAF
         Experimental
         Control
   C. BLIND
         Experimental
         Control
   D. MR
         Experimental
         Control
   E. LEARNING/PERCEPTION/MINOR NEUROLOGICAL PROBLEMS (soft neurological
      signs, minor EEG abnormalities)
         Experimental
         Control
   F. GROSS NEUROLOGICAL PROBLEMS (obvious physical trauma, gross EEG abnormalities)
         Experimental
         Control
   G. ED (neurosis/psychosis)
         Experimental
         Control

8. % INSTITUTIONALIZED (code %; -9 = some, - = not given)
      Experimental
      Control

9. % MINORITY (black, hispanic, and/or immigrant) (code %; -9 = some, - = not given)
      Experimental
      Control

10. % MALE (code %; -9 = some, - = not given)
      Experimental
      Control

11. % SUBJECTS ON ACTIVE DRUGS ([illegible] time from list) (code %; -9 = some, - = not given)
      Experimental
      Control

12. % ALLERGIES (code %; -9 = some, - = not given)
      Experimental
      Control
      specify allergenic substance ____________

13. % DELINQUENT (code %; -9 = some, - = not given)
      Experimental
      Control

III CLASSIFICATION OF CHILDREN AS HYPERACTIVE

(Each item has one column for each effect size: ES6 ES5 ES4 ES3 ES2 ES1)

Code four-digit # for all selection methods used

1. Experimental Group
   a) ....  specify any "other" ____________
   b) ....  specify any "other" ____________
   c) ....  specify any "other" ____________
   d) ....  specify any "other" ____________

2. Control Group
   a) ....  specify any "other" ____________
   b) ....  specify any "other" ____________
   c) ....  specify any "other" ____________
   d) ....  specify any "other" ____________

3. General Quality of Procedures Used to Classify Children as Hyperactive
   1. good: objective measures by 2 or more people or in 2 or more settings
   2. fair: objective measure used by only 1 person in 1 setting - may or
      may not use additional subjective measures
   3. poor: only subjective measures used
   4. unable to tell

Specify names of any instruments used to classify children as hyperactive
   1.
   2.
   3.
   4.

1st Digit - Basis for Selection
   1 = General Hyperactivity
   2 = Activity Level
   3 = Attention/Vigilance
   4 = Aggression
   5 = Impulsivity
   6 = Impossible to Determine
   7 = Other
   8 = Children were not hyperactive

2nd Digit - Instrument
   1 = Opinion
   2 = Rating
   3 = Systematic Observation
   4 = Actometer
   5 = Perceptual-Motor Test
   6 = Experimental Task
   7 = Impossible to Determine
   8 = Other

3rd Digit - Evaluator
   1 = Teacher
   2 = Parent
   3 = Physician
   4 = Researcher
   5 = Observer
   6 = Clinician/Psychologist
   7 = Caretaker
   8 = Impossible to Tell
   9 = Other

4th Digit - Setting
   1 = School
   2 = Home
   3 = Clinic/Doctor's Office
   4 = Experimental Setting
   5 = Day Care
   6 = Institution
   7 = Composite
   8 = Impossible to Determine
   9 = Other



IV TREATMENT

(Each item has one column for each effect size: ES6 ES5 ES4 ES3 ES2 ES1)

1) Treatment type (2 = only, 1 = 1 of several, 0 = no)
      Drug
      Behavioral
      Diet
      Biofeedback
      Comparison of Groups

2) Control Group
   A) Selection
      1 = random
      2 = convenience
      3 = matched
      4 = own
      5 = crossover
      6 = impossible to determine
      7 = children in control group nonhyperactive
      8 = other
   B) Treatment
      1. None
      2. Placebo
            Quality
            4 = present, can't tell
            3 = good
            2 = poor
            0 = absent
      3. Drug (specify)
      4. Behavioral (specify)
      5. Diet (specify)
      6. Biofeedback (specify)
      7. Impossible to determine
      8. Other (specify)

3) Duration of treatment in days ("-" = impossible to determine; -1 = N/A)

4) Days after treatment dependent variable observed (- = not given; -1 = [illegible])

5) Reliability of treatment implementation
      1 complete implementation
      2 minor problems
      3 moderate problems
      4 major problems
      5 impossible to determine
      6 no treatment given - comparison of groups

6) Confidence with which IV, #5 was coded
      1 data based
      2 guess based on data
      3 convention
      4 impossible to determine
      5 not applicable



V DESIGN

1) Type
      1 = random assignment
      2 = non-random but matched
      3 = convenience
      4 = pre-post no control
      5 = single subject/case study
      6 = crossover
      7 = other (specify) ____________

2) Blinding (1 = yes, 2 = probably, 3 = no, 4 = can't tell)
      Subject
      Treatment Implementor
      Data Gatherer

VI THREATS TO VALIDITY

1. Threats to validity
      A) Maturation
      B) History
      C) Testing
      D) Instrumentation
      E) Statistical Regression
      F) Selection Bias
      G) Experimental Mortality
      H) Novelty and Disruption
      I) Experimenter Effect
      J) Inappropriate Statistical Procedures
      K) Other (specify) ____________

      0 = not a plausible threat to the study's internal validity
      1 = potential minor problem in attributing the observed effect to the
          treatment; by itself, not likely to account for a substantial
          portion of observed results
      2 = plausible alternative explanation which could account for a
          substantial amount of the observed results
      3 = plausible alternative explanation which by itself could explain
          most or all of the observed results

General index of validity
      (1 = high ... 5 = low; code from conventions)



VII OUTCOME

(Each item has one column for each effect size: ES6 ES5 ES4 ES3 ES2 ES1)

Outcome Used (Test name) ____________

1) Type of measure
      1 = General hyperactivity
      2 = Cognitive performance
      3 = Attention and vigilance
      4 = Physiological
      5 = Affective
      6 = Activity Level
      7 = Aggression
      8 = Impulsivity
      9 = Achievement
      10 = Other (specify) ____________

2) Instrument
      1 = Opinion
      2 = Rating
      3 = Systematic observation
      4 = Actometer
      5 = Standardized Test
      6 = Experimental Task
      7 = Impossible to determine
      8 = Composite
      9 = Other (specify) ____________

3) Data Collector
      1 = Teacher               7 = Clinician/Psychologist
      2 = Parent                8 = Caretaker
      3 = Physician             9 = Subject
      4 = Researcher           10 = Composite
      5 = Observer             11 = Other
      6 = Counselor

4) Setting where measured
      1 = School                    6 = Institution
      2 = Home                      7 = Composite
      3 = Clinic/Doctor's Office    8 = Impossible to determine
      4 = Day Care                  9 = Other
      5 = Experimental Setting

5) Reliability (- = impossible to determine, 1 = 1.0 - .80, 2 = .80 - .60, 3 = .60-)

6) Treatment Implementor was Data Collector (1 = yes, 2 = no, - = impossible to determine)

7) Setting was 1 = structured, 2 = unstructured, 3 = mixed, [code illegible] = impossible to determine
VIII [heading illegible]

(Each item has one column for each effect size: ES6 ES5 ES4 ES3 ES2 ES1)

1. Effect Size

2. Data from which ES was calculated
      - means and control group's SD (code scale of means from list)
      - means and pooled SD (code scale of means from list)
      - means and published test SD (code scale of means from list)
      - t ratio/F ratio from one-way ANOVA
      - t ratio from matched pairs
      - t test or F ratio from repeated measures or other complex ANOVA design
      - non-parametric test statistic except Chi squared
      - probability estimate for t test or one-way ANOVA
      - S of V Table from n-way ANOVA
      - S of V Table from ANCOVA, repeated measures, or other complex ANOVA design
      - Regression lines
      - Proportions ("probit" transformations)
      - Chi square
      - Other (specify) ____________

3. Scale of Mean Difference
   (If #2 coded 1, 2, or 3, code from 1-6 below)
      1 = final status measure
      2 = raw gain scores
      3 = residual gain scores
      4 = covariance adjusted scores
      5 = impossible to determine
      6 = other (specify) ____________
   if #2 was coded 4 [remainder illegible]

4. Statistical Significance (code p value, - = not given)

Author's Conclusions
      0 = not considered
      1 = intervention appears to work
      2 = impossible to determine
      3 = intervention appears not to work

IX DRUG STUDIES

Study # ____   Card # ____

1.  Type of drug used for experimental group (code from list)
2.  Type of drug used for control group (code from list, 0 = no drug;
    50 = placebo)
3.  [item text not legible]
4.  [item text not legible]
5.  Ending daily average for experimental group (0 = varied; - = not given;
    dosage always in mg/kg, use decimal point with form --.-)
6.  Number of times drug was administered/day for experimental group
    (0 = varied; - = not given)
7.  Hours after drug ingested, dependent variable was measured for
    experimental group (- = not able to determine; "-9.0" = follow-up
    study; "-1.0" = data collected continuously; use form --.-)
8.  Basis of dosage for control group (1 = varied by Ss weight;
    2 = standard across Ss; 3 = clinical decision; - = can't tell)
9.  Beginning daily average for control group (0 = varied; - = not given;
    dosage always in mg/kg, use decimal point with form --.-)
10. Ending daily average for control group (0 = varied; - = not given;
    dosage always in mg/kg, use decimal point with form --.-)
11. Number of times drug was administered/day for control group
    (0 = varied; - = not given)
12. Hours after drug ingested, dependent variable was measured for
    control group (- = not able to determine; "-9.0" = follow-up study;
    "-1.0" = data collected continuously; use form --.-)



DRUG TYPE

Major Tranquilizers

   Phenothiazines
       1  Chlorpromazine (Thorazine, Largactil)
       2  Thioridazine (Mellaril)
       3  Prochlorperazine (Compazine, Stemetil)
       4  Trifluoperazine (Stelazine)
       5  Promazine (Sparine)
       6  Perphenazine (Trilafon)
       7  Triflupromazine (Vesprin)
       8  Mepazine (Pacatal)
       9  Acetophenazine (Tindal)
      10  Fluphenazine (Prolixin, Permitil)
      11  Promethazine (Phenergan)

   Rauwolfia Alkaloids
      12  Reserpine (Serpasil)

   Chlorprothixene
      13  Chlorprothixene (Taractan)

Minor Tranquilizers

   Diphenylmethane Derivatives
      14  Diphenhydramine (Benadryl)
      15  Hydroxyzine (Atarax, Vistaril)
      16  Captodiame, captodiamine (Suvren, Covatin)
      17  Azacyclonol (Frenquel)

   Substituted Propanediol Derivatives
      18  Mephenesin (Tolserol)
      19  Meprobamate (Miltown, Equanil, Trelmar)

   Benzodiazepines
      20  Chlordiazepoxide (Librium)
      21  Diazepam (Valium)
      22  Oxazepam
      23  Nitrazepam

   Butyrophenone Derivatives
      24  Haloperidol
      25  Fluperidol

   Miscellaneous
      26  Phenaglycodol (Ultran)
      27  Benactyzine (Suavitil, Deprol)

Stimulants and Antidepressants

   Amphetamines
      28  Amphetamine (Benzedrine)
      29  Dextro-amphetamine (Dexedrine)
      30  Deanol (Deaner)
      31  Methylphenidate (Ritalin)
      32  Pipradrol (Meratran)
      33  Iproniazid (Marsilid)
      34  Imipramine (Tofranil)
      35  Nialamide (Niamid)
      36  Phenelzine (Nardil)
      37  Phenylisopropylhydrazine (Catron)

Compounds Part of Normal Metabolic Processes
      38  Hormones
      39  Vitamins
      40  Glutamic Acid

Anticonvulsants

Miscellaneous
      41  Neostigmine
      42  Celastrus Paniculata Seeds
      43  Pure Caffeine
      44  Coffee

Biochemical
      45  Desoxyribonucleic Acid (DNA)
      46  Ribonucleic Acid (RNA)
      47  Puromycin
      48  Magnesium Pemoline
      49  Other (specify) ____________

      50  Placebo


COMPUTATION OF EFFECT SIZES

(Article ID # ____)

ES#5

ES#6


(Article ID # ____)

 1. ATTITUDES OF CHILDREN, PARENTS, TEACHERS TOWARDS [remainder illegible]
 2. SIDE EFFECTS OF DRUGS
 3. CHANGES IN CNS (EVOKED POTENTIAL, ETC.) OR PHYSIOLOGICAL FUNCTIONS
    AS A RESULT OF DRUG USAGE
 4. DO DRUGS AFFECT HYPERACTIVITY DIFFERENTIALLY IN STRUCTURED (FORMAL)
    AND UNSTRUCTURED (INFORMAL) SITUATIONS?
 5. DO CHILDREN WHO HAVE BEEN MEDICATED FOR HYPERACTIVITY HAVE GREATER
    DRUG DEPENDENCY PROBLEMS LATER IN LIFE?
 6. IS THERE EVIDENCE THAT CHILDREN "GROW OUT" OF HYPERACTIVITY AT PUBERTY?
 7. DIFFERENCE IN THE RESPONSE OF HYPERACTIVE AND AGGRESSIVE CHILDREN
    TO DRUGS?
 8. RELATIONSHIP BETWEEN HYPERACTIVITY AND LATER DELINQUENCY
 9. DEFINITION OF PREVALENCE OF HYPERACTIVITY, DRUG USAGE, ETC.
10. ASSESSMENT DEVICES AND INSTRUMENTS
11. EFFECT OF DRUGS ON MOTHER/CHILD AND/OR PEER INTERACTION?
12. AGREEMENT BETWEEN OBSERVATIONAL DATA AND RATINGS; OR TEACHER RATINGS
    AND PARENT RATINGS
13. PROBLEMS AND GUIDELINES FOR COLLECTING OBSERVATIONAL DATA ON
    HYPERACTIVE CHILDREN
14. RELATIONSHIP OF HYPERACTIVITY TO THE AMOUNT OF DISTRACTIBILITY IN
    THE SETTING?


COMMENTS ON CODING CONVENTIONS

Article ID # ____

Notes on Clarification and Expansion of Conventions (note page nos. from article)

Notes on Disagreements with Conventions (note page nos. from article)


Appendix 3 
Coding Conventions for Efficacy of Drug 
Treatments for Hyperactivity 



HYPERACTIVITY META-ANALYSIS CONVENTIONS 



Contained in this document are the conventions or basic rules for
coding the hyperactivity research articles. Additional examples of how these
basic rules have been applied are contained in the conventions notebook.
While coding articles, these rules should be used to make most decisions. If
an item is impossible to code using these rules, the item should generally be
coded as impossible to determine. Occasionally, however, educated guesses are
possible. For example, if the study were done at a "boy's" reform school,
item II-10 (% males) should be coded 100, and item II-13 (% delinquent) should
be coded 100 even though this information was not specifically given. Another
example would be if a parent completes a rating scale on the child's level of
activity but the setting is not mentioned. In this case the setting item in
Section VII should be coded "home." When guesses are made, include a brief
explanation on the "comments on conventions" page so the example can be
incorporated into the conventions notebook. Guesses should be the exception
rather than the rule and should only be made when you are confident about
the accuracy.
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I INTRODUCTION 



1. Study ID # - taken from photocopy of article

2. Year - year of publication

3. Source - where published, coded from list of journals

4. Supported by commercial company

   1 = Article stated source(s) of support and the study was completely
       supported by a commercial company.
   2 = Article stated the source of support and the study was partially
       supported by a commercial company. Code 2 if a representative of a
       commercial company served as a consultant or if the company donated
       materials and/or equipment.
   3 = Article stated source(s) of support and it was not a commercial
       company.
   4 = Article did not state source(s) of support.

5. Dissertation

   1 = Article was based on a dissertation and the ES had to be estimated
       using procedures other than [text missing in source].
   0 = Article was a dissertation, or an article based on a dissertation
       where the ES did not have to be estimated, or an article not based on
       a dissertation.

6. Side Effect - Article reported side effects of treatment such as weight
   loss, growth suppression, insomnia, changes in heart rate or other
   physiological or psychological functions not directly related to the
   manifestations of hyperactivity.

   1 = Yes
   2 = No

   - Specify side effects in blank next to item on coding sheet.



II DESCRIPTION OF SAMPLES (see notes 1 and 2)

1. Mean Age

   - Report in months.
   - If rounding is necessary, .5 or greater round up, below .5 round down.
   - If range only is reported, use midpoint as best estimate of mean, e.g.,
     subjects were 9-10 years old, X = 9.5 years = 114 months.

2. Mean IQ

   1 = 130 or more
   2 = 71 - 129
   3 = 70 or less
   4 = not reported

   - If range only is reported, use midpoint as best estimate, e.g.,
     subjects' IQs ranged from 90-110, X = 100.
   - Specify test used in blank next to item on coding sheet.

3. Size of Sample - Number of subjects at time data was analyzed.

4. Socioeconomic Status (SES) - Specify how SES was determined on coding
   sheet. Examples: Low SES would be Title I recipients or low income
   subjects. Middle SES would be blue collar or lower management families.
   High SES would be children of university professors, doctors, or upper
   management. Code as 4 = mixed if the group contains a mixture of SES. If
   the article states that subjects were low, middle, or high without stating
   how it was determined, use the author's statement.



1 For all items in Section II, assume subject mortality is proportional
  unless otherwise stated. In other words, compute the percentages in each
  group at the beginning and don't change the percentage as a result of
  subject mortality unless the article specifically states how many were
  lost from each group.

2 If the article states that some of the sample belonged in categories 7-13
  but does not specify the exact percentage, code the relevant category
  "-9" to indicate some.



5. Severity of Hyperactivity

   1 = None - none of the subjects were hyperactive.
   2 = Mild - all subjects in regular education classes.
   3 = Moderate - one or more subjects receiving special education services
       outside of the regular classroom but no more than half of subjects in
       self-contained classroom.
   4 = Severe - more than half of subjects in self-contained classroom.
   5 = Extreme - half or more of subjects institutionalized because of
       hyperactivity related concerns.
   6 = Unable to tell - article gives insufficient information for
       determining severity of hyperactivity.

6. Diagnosis - What the author(s) most often call the condition.

7. % Handicapped - Code the % of children in the sample in categories A-G
   below. If the sample has "some" MR children but doesn't say how many,
   code MR "-9". Use the same rule for other handicapping conditions.

   A. Multiple - Children having two or more handicaps.
   B. Deaf
   C. Blind
   D. MR - Mentally retarded. Subjects' IQs are below 70.
   E. Learning/Perception/Minor Neurological Problems - Children referred
      to as LD (learning disability) or MBD (minimal brain dysfunction), or
      exhibiting soft neurological signs (e.g., low scores on perceptual-motor
      tests, coordination problems, etc.) or EEG abnormalities
      the author(s) refer to as minor.
   F. Gross Neurological Problems - Obvious physical trauma, or EEG
      abnormalities. Count in this category children suffering from
      seizures and/or convulsions.
   G. ED - Emotionally disturbed, children referred to as neurotic or
      psychotic.

8. % Institutionalized - Subjects are full-time residents of an
   institution. On this item subjects in an institution would be counted as
   institutionalized whether or not their institutionalization had anything
   to do with hyperactivity.



9. % Minority

   - Include Black, Hispanic, Native American, and immigrant subjects.
   - Do not include American-Oriental subjects.

10. % Male

11. % Subjects on Active Drug Other Than the Drug Being Investigated

   - Subjects were on an active drug other than the drug investigated at a
     time that would have affected the ES for the drug being investigated.
     For example, if the design required using the pretest/baseline
     measures in computing an ES and subjects were on an active drug during
     baseline, this should be coded.
   - Check drug list for time to become active/inactive in system.

12. % Allergies - Author(s) state that subjects were allergic to some
    substance.

13. % Delinquent - Author(s) state that children were delinquent or had been
    in trouble with the law.
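The numeric rules in Section II (age in months, rounding, range midpoints, IQ categories, and the "-9" code for "some") can be illustrated with a short sketch. This is not part of the original conventions: the function names are hypothetical, the treatment of a mean IQ between 70 and 71 is an assumption, and applying the age rounding rule to percentages is also an assumption.

```python
import math


def round_half_up(value):
    """Round to a whole number; .5 or greater rounds up, below .5 rounds down."""
    return math.floor(value + 0.5)


def mean_age_months(low_years, high_years=None):
    """Mean age in months; when only a range is reported, use its midpoint."""
    years = low_years if high_years is None else (low_years + high_years) / 2
    return round_half_up(years * 12)


def iq_category(mean_iq):
    """1 = 130 or more, 2 = 71-129, 3 = 70 or less, 4 = not reported."""
    if mean_iq is None:
        return 4
    if mean_iq >= 130:
        return 1
    if mean_iq > 70:  # assumption: a mean between 70 and 71 falls in category 2
        return 2
    return 3


def percent_code(count=None, group_size=None, some=False):
    """Percent of a group in a category; "-9" = some, "-" = not given."""
    if count is not None and group_size:
        return str(round_half_up(100 * count / group_size))
    return "-9" if some else "-"


print(mean_age_months(9, 10))    # range of 9-10 years, midpoint 9.5 years
print(iq_category((90 + 110) / 2))
print(percent_code(some=True))
```

Python's built-in `round` rounds halves to the nearest even number, so the sketch uses `math.floor(value + 0.5)` to follow the ".5 or greater round up" rule.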
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III CLASSIFICATION OF CHIIDREN AS HYPFRACTIVE 



This section describes the basis by which the child was classified as
hyperactive. It should not be confused with other criteria used in selecting
the sample such as IQ, EEG abnormalities, sex, age, etc. These other
characteristics are coded in Section II. This section deals only with the
information which was used to decide whether the child was hyperactive. For
each group (experimental and control) code separately each source of information
collected. For example, a study might have used a parent's opinion about
general hyperactivity at home (this should be coded 1122 under "a") and also
a classroom observation of activity level collected by the researcher (this
should be coded 2341 in "b") and a test of impulsivity such as the Matching
Familiar Figures Test (this should be coded 5644 in "c"). For both experimental
and control groups code as many separate methods as were used to classify the
children as hyperactive and list in a-d. If the "control" group was
non-hyperactive, the basis by which children were classified as hyperactive is
irrelevant; therefore code "a" under control group as 8000. If the article states
only that children were referred by parents or teachers as being hyperactive,
code basis = 1 (general hyperactivity) and instrument = 1 (opinion).

Basis for Selection - Code 4 digits indicating basis, instrument, evaluator,
and setting.

1 = General Hyperactivity - Basis for selection was a composite measure of
    hyperactivity in which characteristics such as activity, attention,
    aggression, or impulsivity could not be, or were not, separated. Or the
    article states that children were referred by parents because of
    problems with hyperactivity.
2 = Activity - Index reflecting general motor activity.
3 = Attention/Vigilance - An index which requires the ability to sustain
    attention/vigilance as a primary focus.
4 = Aggression - Index reflecting a subject's tendencies or actual
    behaviors which are intended to destroy or cause injury.
5 = Impulsivity - Any measure which yields an indication of whether a
    child makes decisions too rapidly, fails to pause to consider possible
    alternatives, fails to reflect on possible consequences of a decision,
    and/or seizes on the first response that comes to mind.
6 = Other - Any other basis on which subjects are diagnosed as
    hyperactive. Specify basis on coding sheet.

Instrument

1 = Opinion - Global impression.
2 = Rating - Placement on a scale/rating form via opinion or recall.
3 = Systematic Observation - Systematic recording of one or more aspects
    of the child's behavior, e.g., frequency, intensity, duration, etc.
4 = Actometer - Any instrument used to automatically record movement.
5 = Perceptual-motor Test - Standardized test designed to measure a
    child's ability to coordinate sensory information and movement.
6 = Experimental Task - Any task designed specifically to serve as an
    index of hyperactivity or some aspect of it. Specify task on coding
    sheet.
7 = Impossible to Determine - Cannot be determined from article what
    instrument(s) were used to assess hyperactivity.
8 = Other - Any other instrument used to assess hyperactivity. Specify
    instrument on coding sheet.

Evaluator

1 = Teacher - Person with primary responsibility for providing the child
    with instruction.
2 = Parent - or legal guardian.
3 = Physician - MD.
4 = Researcher - Any person instrumental in the conceptualization and/or
    design of the study. Code physicians, psychologists, teachers, etc.
    who were on the research team in this category.
5 = Observer - Anyone not listed in another category making systematic
    observations of the subjects' behavior.
6 = Clinician/Psychologist - Any person whose background is primarily
    psychology and whose role is consistent with such background.
7 = Caretaker - Any person whose role within a residential institution
    gives him/her primary responsibility for the care of the child.
8 = Impossible to Determine - Article does not state who evaluator was.
9 = Other - Any other person who evaluated the children to determine if
    they were hyperactive. Specify person on coding sheet.

Setting (Code setting where observations of child were made)

1 = School - Classroom or other area of school building.
2 = Home - Residence other than institution.
3 = Clinic/Doctor's Office - Any medical facility.
4 = Experimental Setting - Any non-naturally occurring setting established
    specifically as an area/situation in which to observe/record
    hyperactivity.
5 = Day Care - Place other than educational facility or home where child
    is cared for during the day.
6 = Institution - Any residential facility for the treatment of
    handicapped or disturbed persons.
7 = Composite - A combination of settings where hyperactivity was
    assessed and it is impossible to separate the various settings.
8 = Impossible to Determine - Article does not state in what setting the
    children were assessed as hyperactive.
9 = Other - Any other setting in which children were assessed as
    hyperactive. Specify setting on coding sheet.
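The four-digit classification code (basis, instrument, evaluator, setting) can be illustrated with a small lookup sketch. The tables repeat the definitions given above; the function and variable names are hypothetical. Only basis codes 1 to 5 are included, because the coding sheet and these conventions number the remaining basis codes differently, and the special value 8000 for a non-hyperactive control group is handled separately.

```python
# Code tables transcribed from the conventions; names are illustrative only.
BASIS = {1: "General Hyperactivity", 2: "Activity", 3: "Attention/Vigilance",
         4: "Aggression", 5: "Impulsivity"}
INSTRUMENT = {1: "Opinion", 2: "Rating", 3: "Systematic Observation",
              4: "Actometer", 5: "Perceptual-motor Test",
              6: "Experimental Task", 7: "Impossible to Determine", 8: "Other"}
EVALUATOR = {1: "Teacher", 2: "Parent", 3: "Physician", 4: "Researcher",
             5: "Observer", 6: "Clinician/Psychologist", 7: "Caretaker",
             8: "Impossible to Determine", 9: "Other"}
SETTING = {1: "School", 2: "Home", 3: "Clinic/Doctor's Office",
           4: "Experimental Setting", 5: "Day Care", 6: "Institution",
           7: "Composite", 8: "Impossible to Determine", 9: "Other"}
NON_HYPERACTIVE_CONTROL = "8000"  # coded under "a" for a non-hyperactive control group


def encode(basis, instrument, evaluator, setting):
    """Build the four-digit code from one digit per table."""
    pairs = ((basis, BASIS), (instrument, INSTRUMENT),
             (evaluator, EVALUATOR), (setting, SETTING))
    for value, table in pairs:
        if value not in table:
            raise ValueError(f"unknown code: {value}")
    return f"{basis}{instrument}{evaluator}{setting}"


def decode(code):
    """Translate a four-digit code back into its four labels."""
    if code == NON_HYPERACTIVE_CONTROL:
        return ("Control group was non-hyperactive",)
    if len(code) != 4 or not code.isdigit():
        raise ValueError("code must be four digits")
    tables = (BASIS, INSTRUMENT, EVALUATOR, SETTING)
    return tuple(table[int(digit)] for table, digit in zip(tables, code))


print(encode(1, 1, 2, 2))  # parent's opinion about general hyperactivity at home
print(decode("2341"))      # classroom observation of activity level by the researcher
```

Keeping the code as a string rather than an integer preserves each digit's position, which matters for values such as 8000.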

General Quality of Procedures Used to Classify Children as Hyperactive

1 = Good: objective measures by 2 or more people or in 2 or more
    settings.
2 = Fair: objective measures used by only 1 person in 1 setting; may or
    may not have used additional subjective measures.
3 = Poor: only subjective measures used.
4 = Unable to tell: in those cases where it is impossible to determine
    from the article how children were classified as hyperactive.

In coding the general quality of the procedures used to classify children
as hyperactive, opinions and ratings will generally be considered
subjective measures, and systematic observation, actometers, other
electro/mechanical recording devices, and perceptual-motor and other
experimental tasks will generally be considered objective. However,
if a rating scale were done by blind observers and included good
criteria, it would qualify as an objective measure. Or if an observation
was done by non-blind observers and/or used vague criteria, it would
be a subjective measure. Using similar rationale, other exceptions
may be used. Be sure to note exceptions for the conventions book of
examples.
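The general quality rule can be read as a small decision procedure. The sketch below is one possible reading, not the authors' procedure: the function name and arguments are hypothetical, the counts of people and settings are assumed to refer to the objective measures only, and treating missing counts as "unable to tell" is an assumption. The coder would still decide beforehand which measures count as objective, including any exceptions.

```python
def quality_code(n_objective, n_subjective, n_people=None, n_settings=None):
    """Return 1 = good, 2 = fair, 3 = poor, 4 = unable to tell.

    n_objective / n_subjective: number of objective and subjective measures
    used to classify children as hyperactive (None if the article does not say).
    n_people / n_settings: how many people applied the objective measures and
    in how many settings (None if not reported).
    """
    if n_objective is None or n_subjective is None:
        return 4
    if n_objective == 0:
        # Only subjective measures, or no measures described at all.
        return 3 if n_subjective > 0 else 4
    if (n_people or 0) >= 2 or (n_settings or 0) >= 2:
        return 1
    if n_people is None or n_settings is None:
        return 4  # assumption: missing detail is treated as unable to tell
    return 2


print(quality_code(n_objective=1, n_subjective=0, n_people=2, n_settings=1))
```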
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IV TREATMENT 



1. Treatment Type

   2 = This was the only treatment given the subjects; subjects may have
       received other treatments consecutively but not concurrently.
   1 = This was one of several concurrent treatments given the subjects.
   0 = This was not a treatment given the subjects.

   - Drug - Any treatment where a nonfood substance was administered to the
     subjects.
   - Behavioral - Any systematic manipulation of the subjects' environment or
     reward/punishment system.
   - Diet - Any treatment where a food substance was added to or deleted from
     the subjects' diet.
   - Biofeedback - Any treatment where the subjects are given feedback on
     some parameter of their physiology.
   - Comparison of Groups - Any study where the behavior of hyperactive
     children was compared to that of another group of children in the same
     environment but no treatment was involved.

2. Control Group

   A. Selection - How comparison group was selected.

      1 = Random - Randomly assigned to experimental and control groups from
          same initial group.
      2 = Convenience - Selected simply because subjects were available.
      3 = Matched - Selected because they matched experimental group on some
          parameter.
      4 = Own - Experimental group was observed pre-post treatment or under
          two conditions and served as its own control.
      5 = Crossover - Control group was formed by having a crossover design
          in which all subjects received both the experimental and the
          control (or placebo) conditions.
      6 = Impossible to Determine - Article does not state how the control
          group was selected.
      7 = Children in the "control" group were non-hyperactive.
      8 = Other - Any other basis on which the control group was selected.
          Specify basis on coding sheet.



   B. Treatment - Condition under which control group was observed.

      1 = None - Received no treatment.
      2 = Placebo - Received a placebo treatment.

          - Quality of Placebo
            4 = Present, Can't Tell - Placebo was given but article contained
                insufficient information for determining its quality.
            3 = Good - Precautions were taken to assure that the placebo could
                not be distinguished from treatment. For example, pills
                should be the same size, shape, color, and taste to qualify as
                a good placebo.
            2 = Poor - Placebo could be distinguished from the treatment. For
                example, pills should be the same size, shape, color, and
                taste to qualify as a good placebo.
            0 = Absent

      3 = Drug - Received a drug treatment, specify on coding sheet.
      4 = Behavioral - Received a behavioral treatment, specify on coding
          sheet.
      5 = Diet - Received a diet treatment, specify on coding sheet.
      6 = Biofeedback - Received a biofeedback treatment, specify on coding
          sheet.
      7 = Impossible to Determine - Article does not specify treatment
          received by control group.
      8 = Other - Received any other treatment. Specify treatment on coding
          sheet.

3. Duration of Treatment in Days - Number of Days from first day of treatment 
to last, include the minimum number of weekends that could have occurred 
in the time span. In cases where there is no treatment (e.g., comparison 
of groups) code this -1. 

4. Days After Treatment the Dependent Variable was Observed - Time from end
of treatment to when outcome measure was made. Code as "0" if Dependent
Variable was measured while treatment was still being administered.
Code as -1 if no treatment was given.

5. Reliability of Treatment Implementation - Confidence one can have that 
treatment was implemented as described. The article may present data on 
the reliability of treatment implementation, for example, state that two 
observers watched a teacher implement a behavior program and that she was 
90% reliable in implementing it. Alternatively, the article may present
information that allows an estimate of the reliability of treatment
implementation. If no data or basis for an estimate is given, reliability of
treatment implementation may be estimated using the conventions in 1-5 below.

1 = Complete - Reliability data is presented or can be estimated and is
.90 or better.

2 = Minor - Reliability data is presented or can be estimated and is
.80-.89 or no reliability data is given but treatment was implemented
by a professional, teacher, researcher, physician, etc.

3 = Moderate - Reliability data is presented or can be estimated and is
.70-.79 or no reliability data is presented but treatment was
implemented by parents or paraprofessionals.

4 = Major - Reliability data is presented or can be estimated and is
.69 or less.

5 = Impossible to Determine - No reliability data is presented and it
is not clear from article who implemented treatment.

6 = No treatment given - In situations where groups are being compared
(e.g., hyperactive children vs. non-hyperactive children) there
is no treatment and, consequently, no reliability of treatment
implementation.
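
The convention above can be sketched as a small lookup. This is only an illustration, not part of the original coding materials: the function name, parameter names, and implementor labels are hypothetical, and the cut points for codes 1 and 2 (.90 and .80) are assumed from the adjacent bands, since those values are hard to read in the scanned copy.

```python
from typing import Optional

# Hypothetical labels for who implemented the treatment.
PROFESSIONALS = {"professional", "teacher", "researcher", "physician"}
LAY_IMPLEMENTORS = {"parent", "paraprofessional"}


def code_treatment_reliability(reliability: Optional[float] = None,
                               implementor: Optional[str] = None,
                               treatment_given: bool = True) -> int:
    """Return the 1-6 code for reliability of treatment implementation."""
    if not treatment_given:
        return 6  # groups were only compared; no treatment to implement
    if reliability is not None:
        # Reliability data presented or estimable (cut points assumed).
        if reliability >= 0.90:
            return 1
        if reliability >= 0.80:
            return 2
        if reliability >= 0.70:
            return 3
        return 4
    # No data: fall back on the convention based on who implemented treatment.
    if implementor in PROFESSIONALS:
        return 2
    if implementor in LAY_IMPLEMENTORS:
        return 3
    return 5  # impossible to determine


code_teacher = code_treatment_reliability(implementor="teacher")
code_observed = code_treatment_reliability(reliability=0.92)
```

A coder would still record under the next item whether the code rested on data, an estimate, or the implementor convention.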

6. Confidence with Which #5 Was Coded

1 = Data Based - Article gives data-based information about how 
completely the treatment was implemented. 

2 = Guess Based on Data - Article gives information from which a
coefficient describing the reliability of treatment implementation
can be estimated.

3 = Convention - The reliability of treatment implementation was 

estimated based on who implemented the treatment (see #5). 

4 = Impossible to Determine - #5 was coded 5 because the article gave
insufficient information regarding how treatment was implemented.

5 = Not Applicable - This item would be not applicable in cases where
groups are being compared or where #IV-5 was left blank.

V. DESIGN



1. Type

1 = Random Assignment - Subjects are randomly assigned to groups. 

2 = Non-Random but Matched - Not randomly assigned to groups but 

control subjects were matched to experimental subjects on some 
parameter.

3 = Convenience - Basis for selecting subjects was that they were
available.

4 = Pre-post, No Control - Experimental group was observed under two
or more conditions and no crossover design was employed.

5 = Single Subject - Data is presented as graphic display for 

individual subjects. 

6 = Other - Any other design. Specify design on coding sheet.

2. Blinding

1 = Yes - Individual definitely blind.

2 = Probably - Individual was not told the purpose of the study and/or
what subjects were under what conditions but possibly could have
figured it out.

3 = No - Individual definitely was not blind.

4 = Can't tell - Article did not give enough information to determine
if individual was blind.

- Subject - Individual(s) for whom treatment was implemented.

- Treatment Implementor - Individual(s) who implemented treatment.

- Data Gatherer - Individual(s) who collected data about Dependent
Variable.



VI. THREATS TO VALIDITY

General Convention: Each of the "threats" listed below should be coded
using the following conventions. Definitions and examples of the
"threats" follow the general convention. Two are contained in conventions
notebook.

0 = Not plausible threat to internal validity.

1 = Potential minor problem in attributing the observed effect to
treatment; by itself, not likely to account for substantial amount of
the observed results.

2 = Very plausible alternative explanation which could account for
substantial amount of the observed results.

3 = Very plausible alternative explanation which by itself could explain
most or all of the observed results.

Maturation - Definition - Biological, physiological, or psychological
"processes within the respondents varying systematically with the passage
of time" but not as the result of specific events (including the
experimental treatment) external to the respondents, e.g., growing older,
more tired, better coordinated, etc. Suppose an experimenter claimed that
a series of prescribed play activities were effective in promoting bladder
control in infants; as evidence he showed that 2% of the 15-month old
infants starting his experiment had control, and 75% of these infants
achieved control 9 months later. His claim is questionable since the
normal infant develops bladder control during this period naturally.

Testing 

The effects of taking a test on the outcomes of subsequent administration 
of the same or a highly related test. Taking some cognitive-ability tests 
may increase your score by several points on a second administration of 
the same test or a parallel form of it. For example, this would be a
threat if children were tested repeatedly with the same test instrument
and no control group was included in the design.



Instrumentation

Changes in the instruments (tests, judges, experimenters, or observers) may
produce changes in the scores over time which are mistaken as treatment
effects. For example, judges observing and rating some performance may be
more lenient from time 1 to time 2.

Statistical Regression - The inevitable tendency of persons whose scores 
are extreme (high above or far below the mean) on Measurement A to be less 
extreme (less high above or less far below the mean) on Measurement B. 
This phenomenon of "regression toward the mean" will be observed whenever
Measurements A and B are not perfectly correlated, which for all practical
purposes is always. For example, this will be a threat if children in the
experimental group were selected on the basis of an extreme score which
was used simultaneously as a pretest and there was not a control group or
the control group was not selected on the basis of the same extreme
scores.

Selection Bias - Children in the experimental and control group were
selected on different bases. Definition - All of those factors which
conspire to make the experimental and the control groups unequal at the
outset of an experiment in ways which cannot be properly taken into
account in the analysis of the data. For example, selection might
invalidate a comparison of curricula A and B if older, more experienced
teachers were selected to teach the more difficult curriculum. It appears
that in almost all instances the only feasible way to completely guard
against selection bias is by employing the random assignment of persons or
classrooms to treatments and then using statistical analyses of the final
data which are based on the randomization procedure. Quasi experimental
designs will almost always have some selection bias. Designs in which
subjects serve as their own controls (pre-post, crossover) will usually
not have selection bias, but often have other problems.

6. Experimental Mortality - The differential loss or "dropping out" of 
persons from two or more groups being compared in an experiment. If 
attrition is greater under curriculum A than curriculum B, a comparison of 
A and B at the end of one school year might be biased in that the students 
completing A would be brighter, on the average, than those completing B.
This is true simply because the slower students were fatalities under
curriculum A.

7. Novelty and Disruption - Measurement of the children's behavior was made 
in an environment that was new to them and it is plausible that the 
newness of the environment was responsible for different scores and no 
control group was included in the design of the study. 

8. Experimenter Effect - Attitudes of experimenter regarding expected
research results are known to treatment implementor, data collector, or 
subject. 



2. General Index of Validity.

• well executed true experimental designs
• well executed double blind crossover designs with order effects
  balanced and sufficient time for previous drugs to become inactive

• true experimental designs with minor problems (1-3 "1" ratings)
• well executed quasi experimental designs (no "1" except for selection)
• well executed single subject
• crossover designs with minor problems
Only "1" ratings, no more than 3 points

• quasi experimental designs with minor problems (1-3 "1" ratings or
  1 "2" rating)
• well executed pre post designs (no "1" besides selection, maturation,
  history)
• single subject with minor problems
• true experimental with moderate problems (2-4 "1" ratings or 1-2 "2"
  ratings)
Only "1" or "2" ratings, no more than 6 points

• pre post designs with minor to moderate problems (2-4 "1" ratings
  or 1-2 "2" ratings)
• quasi experimental with moderate problems (6 or more points, with at
  least 2 "2" ratings)
• true experimental with major problems
• single subject with moderate problems

• any design with one or more "3" ratings
• pre post designs with major problems (6+ points with at least 2 "2"
  ratings)
• single subject/case studies with major problems

VII. OUTCOME

1. Type of Measure

1 = General - Index reflecting general hyperactive behavior.

2 = Cognitive Performance - Any general or specific measure of
cognitive ability such as might be obtained through an IQ test or
one of Piaget's conservation problems.

3 = Attention/Vigilance - Index reflecting ability to sustain
attention/vigilance.

4 = Physiological - Any measure of a physiological parameter.

5 = Affective - Any general or specific measure of the perception of
oneself or others, or the ability to relate to others such as
might be obtained by a self-concept or sociometric test
respectively.

6 = Activity - Index reflecting general motor activity.

7 = Aggression - Index reflecting behavior(s) defined as indicative of
aggression.

8 = Impulsivity - Index reflecting behavior(s) defined as indicative
of impulsivity.

9 = Achievement - Index reflecting learning in one or more areas.

10 = Other - Any other outcome measure. Specify measure on coding
sheet.

2. Instrument

1 = Opinion - Global impression.

2 = Rating - Placement on a scale via opinion.

3 = Systematic Observation - Systematic recording of one or more
aspects of the child's behavior, e.g., frequency, intensity,
duration, etc.

4 = Actometer - Any instrument used to automatically record behavior.

5 = Standardized Test - Test for which norms have been collected.
Specify test on coding sheet.

6 = Experimental Task - Any task designed specifically to serve as an
index of hyperactivity or some aspect of it. Specify task on
coding sheet.

7 = Impossible to Determine - Article does not state what instrument
was used to measure outcome.

8 = Composite - Any combination of instruments used to measure outcome
and yielding one score.

9 = Other - Any other instrument used to measure outcome.

3. Data Collector

1 = Teacher - Persons with primary responsibility for providing the 

child with instruction. 

2 = Parent - Or legal guardian.

3 = Physician - MD

4 = Researcher - Any person instrumental in the conceptualization
and/or design of the study.

5 = Observer - Anyone not listed in another category making systematic
observations of the subjects' behavior.

6 = Counselor - Any person whose background is primarily counseling 

and guidance and whose role within a school or other institution 
is consistent with such background. 

7 = Clinician/Psychologist - Any person whose background is primarily 

psychology and whose role is consistent with such background. 

8 = Caretaker - Any person whose role within a residential institution 

gives him/her primary responsibility for the care of the child. 

9 = Subject - Individual receiving treatment or being used for
comparison to determine treatment effects.

10 = Composite - Any combination of data collectors whose measurements 

yield one score. 

11 = Other - Any other person who evaluated the children to determine 

if they were hyperactive. Specify person on coding sheet. 

4. Setting (Code setting where observations of child were made)

1 = School - Classroom or other area of school building.

2 = Home - Residence other than institution.

3 = Clinic/Doctor's office - Any medical facility.

4 = Day Care - Place other than educational or home where child is
cared for during the day.

5 = Experimental Setting - Any setting established specifically as an
area/situation in which to observe/record hyperactivity.

6 = Institution - Any residential facility.

7 = Composite - Any combination of settings where hyperactivity was
assessed.

8 = Impossible to Determine - Article does not state in what setting
the children were assessed as hyperactive.

9 = Other - Any other setting in which children were assessed as
hyperactive. Specify setting on coding sheet.

5. Reliability of Outcome Instrument 

0 = not given

1 = 1.0 - .80

2 = .80 - .60

3 = .60 or below 

In cases where the reliability of the instrument is not specified 
in the article, but is known from the published norms for subjects 
which are similar to the subjects in the study, use the published 
reliability. If published reliability and reliability reported in 
study conflict, use the reliability reported in the study, unless 
there is some reason to doubt it. 
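
The rule for choosing between study-reported and published reliability, and then assigning the 0-3 code, might be sketched as follows. The names are hypothetical, and the handling of the boundary values is an assumption: the categories as printed overlap at .80 and .60, so this sketch places .80 in code 1 and .60 in code 3.

```python
from typing import Optional


def code_outcome_reliability(study: Optional[float] = None,
                             published: Optional[float] = None,
                             doubt_study: bool = False) -> int:
    """Return the 0-3 code for reliability of the outcome instrument."""
    # Prefer the study's own figure unless there is reason to doubt it
    # and a published figure is available to use instead.
    if study is not None and not (doubt_study and published is not None):
        value = study
    elif published is not None:
        value = published
    else:
        return 0  # not given anywhere
    if value >= 0.80:   # boundary assignment is an assumption
        return 1
    if value > 0.60:
        return 2
    return 3


code_from_norms = code_outcome_reliability(published=0.70)
code_conflict = code_outcome_reliability(study=0.85, published=0.65)
```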

6. Treatment Implementor was Data Collector 

1 = Yes

2 = No

3 = Impossible to determine 

7. Setting Where Outcome Measure Was Taken

1 = Structured - Any setting where requirements are made on the child,
for example, a classroom where he/she must stay in his/her seat
and do assigned work.

2 = Unstructured - Any setting where no requirements are made on the
child, for example, a playground where the child can do anything
he/she wants.

3 = Mixed - Code 3 when the article states that outcome was measured
in both structured and unstructured settings and measurements were
combined into one index.

4 = Impossible to determine - Article does not state in what setting
outcome measures were made.

VIII. CONCLUSION



1. Effect Size (ES) - Estimate of effect size. Express as positive or
negative relative to the desired effect of treatment. For example, if the
treatment decreases hyperactivity, calculations from the formula may leave 
a negative effect size which should, however, be expressed positively 
because the desired effect of treatment is to decrease hyperactivity. 
Alternatively, if the treatment increases drowsiness, calculations from 
the formula may leave a positive effect size which should, however, be 
expressed negatively because this is not a desirable treatment effect. 
Show formula used and calculation on coding sheet. 
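
The sign convention can be illustrated with a short sketch. The project's formula is not legible in this copy, so the code assumes the standardized mean difference using the control group standard deviation, which matches the first data source listed for ES calculation; the function and parameter names are hypothetical.

```python
def effect_size(mean_exp: float, mean_ctrl: float, sd_ctrl: float,
                higher_is_better: bool) -> float:
    """Standardized mean difference, signed so that positive = desired."""
    if sd_ctrl <= 0:
        raise ValueError("control group SD must be positive")
    raw = (mean_exp - mean_ctrl) / sd_ctrl  # assumed formula
    # Flip the sign when a lower score is the desired outcome.
    return raw if higher_is_better else -raw


# Treatment lowers a hyperactivity rating: raw value is negative,
# but it is expressed positively because the decrease is desired.
es_hyperactivity = effect_size(12.0, 15.0, 6.0, higher_is_better=False)

# Treatment raises drowsiness: raw value is positive, but it is
# expressed negatively because the increase is not desired.
es_drowsiness = effect_size(4.0, 2.0, 4.0, higher_is_better=False)
```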

In calculating effect sizes when X̄'s and SD's are not given, estimates
must sometimes be made. The following conventions have been adopted for
some of the most frequently required estimates.

Correlations: Between two standardized psycho-educational tests 
between two rating scales 

Reliabilities: Rating scales - .60 

2. Data ES Was Calculated From

1 = Means and control group SD - Article gave means for the
experimental and control groups and a standard deviation for the
control group from which ES was calculated.

2 = Means and pooled SD - Article gave means for the experimental and
control groups and a pooled standard deviation from which the ES
was calculated.

3 = Means and published test SD - Article gave means for the
experimental and control groups and the standard deviation was
known for the published test used as an outcome measure. ES was
calculated from these data.

4 = t ratio/F ratio from one-way ANOVA - Article gave a t or F value
from a one-way ANOVA from which ES was calculated.

5 = t ratio from matched pairs t test or F ratio from repeated
measures or other complex ANOVA design.

6 = Nonparametric test statistic except chi square.

7 = Probability estimate for t test or one-way ANOVA - Article gave a
p-value from which a t or F from a one-way ANOVA was calculated
and ES was calculated using these estimates.

8 = Source of variance estimate for n-way ANOVA - Article gave a
source of variance table for n-way ANOVA from which ES was
calculated.

9 = Source of variance table from ANCOVA, repeated measures, etc.

10 = Regression lines

11 = Proportions

12 = Chi square

13 = Other - Any other basis on which ES was calculated. Specify basis
on coding sheet.
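
For data source 4, one common conversion is the algebraic identity between an independent-groups t ratio and the standardized mean difference. The sketch below uses that identity as an assumption; the source does not show which "cookbook" formulas the project used, and the function names are hypothetical. The result is scaled by the pooled SD implied by the t test, not the control group SD.

```python
import math


def es_from_t(t: float, n_exp: int, n_ctrl: int) -> float:
    """ES from an independent-groups t ratio: t * sqrt(1/n_e + 1/n_c)."""
    if n_exp <= 0 or n_ctrl <= 0:
        raise ValueError("group sizes must be positive")
    return t * math.sqrt(1.0 / n_exp + 1.0 / n_ctrl)


def es_from_f(f: float, n_exp: int, n_ctrl: int,
              favors_treatment: bool) -> float:
    """ES from a two-group one-way ANOVA F ratio, where t = sqrt(F).

    F carries no sign, so the direction of the difference must be
    supplied from the article.
    """
    if f < 0:
        raise ValueError("F ratio cannot be negative")
    t = math.sqrt(f)
    return es_from_t(t if favors_treatment else -t, n_exp, n_ctrl)


es_t = es_from_t(2.0, 8, 8)
es_f = es_from_f(4.0, 8, 8, favors_treatment=False)
```

The sign convention for the desired direction of the effect still applies after the conversion.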

3. Scale of Means Difference 

1 = Final status measure - Raw or standard scores were used to
calculate means.

2 = Raw gain score - Difference between pretest and posttest
scores were used to calculate means.

3 = Residual gain score - Pretest and posttest scores were correlated,
the correlation was used to predict posttest score from pretest
score, and the difference between the predicted and the obtained
posttest scores were used to calculate means.

4 = Covariance adjusted scores - Outcome scores were correlated with
scores on a covariate and adjusted to represent the outcome
scores that would have been obtained if all subjects had obtained
the same score on the covariate and used to calculate means.

4. Statistical Significance 

3 digit p value - Provide p value for test of statistical
significance. If intervention reduced hyperactive condition or
associated variable, p value should be low (e.g., .010, .030,
.070). If intervention increased hyperactive condition, p value
should be high (e.g., .940, .980, .990). In the latter case, the
obtained p value from data showing a mean difference favoring the
control group will be subtracted from 1.00.

- = Not given - Article does not present information on statistical
significance of treatment.
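
A minimal sketch of this coding rule follows. The function name and the choice to return the code as text are hypothetical; the rule itself (subtract the obtained p value from 1.00 when the difference favors the control group, and enter "-" when no value is given) is taken from the convention above.

```python
from typing import Optional


def code_p_value(p: Optional[float], favors_treatment: bool) -> str:
    """Return the 3-digit p value code, or "-" when none is reported."""
    if p is None:
        return "-"
    if not 0.0 <= p <= 1.0:
        raise ValueError("p value must lie between 0 and 1")
    coded = p if favors_treatment else 1.0 - p
    text = f"{coded:.3f}"
    # Drop the leading zero to match the .010 style used on the sheet.
    return text[1:] if text.startswith("0") else text


low = code_p_value(0.01, favors_treatment=True)
high = code_p_value(0.06, favors_treatment=False)
missing = code_p_value(None, favors_treatment=True)
```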

5. Author's Conclusion 

0 = Not considered - Author(s) make no statement regarding clinical
significance of treatment.

1 = Intervention appears to work - Author(s) conclude that treatment
works. Those cases where the author concludes that the intervention
works but only for certain subsets will usually be accounted
for by the different ES categories. If this does not account for
it, code it "1" anyway.

2 = Impossible to determine - Author(s) conclude that effect of
treatment can't be evaluated for some reason.

3 = Intervention doesn't work - Author(s) conclude that treatment
doesn't work.

Appendix 4
Procedures for Contacting Authors
for Additional Information

REQUEST FOR ADDITIONAL INFORMATION



Procedures: 



Objectives: 1) Where it is impossible to calculate an ES from the journal
article, TO OBTAIN INFORMATION NECESSARY TO CALCULATE AN ES.

2) Where it is possible to calculate an ES but the ES must be
estimated using "cookbook" procedures, TO OBTAIN INFORMATION
ON MEANS AND STANDARD DEVIATIONS SO THAT THE ADEQUACY OF
"COOKBOOK" PROCEDURES CAN BE EXAMINED.

After reading the article, if either of the above objectives is
relevant, fill out the information on the attached form and give
the form to Marilyn. The number of dependent variables listed
on the form will be determined by the number of ES you can
differentiate from the article. In other words, an article
may describe a study which collected data and reported F ratios
for three dependent variables for both brain injured and non-brain
injured hyperactive children. From this article, you would
probably want the following six ES's.

1) Disruptive behavior - brain injured children

2) California Achievement Test - reading subtest - brain
injured children

3) California Achievement Test - math subtest - brain
injured children

4) Disruptive behavior - non-brain injured children

5) California Achievement Test - reading subtest -
non-brain injured children

6) California Achievement Test - math subtest -
non-brain injured children

Do not request information for subgroups within the study that 
the author did not consider in the original article. In the 
above example, the three dependent variables should not be 
broken down by brain injured and non-brain injured unless the 
author in the original article considered differences between 
brain injured and non-brain injured children. 



Finding Addresses: 



If address is provided in article and publication is
since January, 1978, use it. If address is provided, but
article is before January, 1978, check procedures noted
below and use most recent address. If no better address
is available, use the one in the article. If no address
is noted in the article, check recent APA Directory, AERA
Directory, or Directory to Faculty of American Colleges
and Universities. If none of these work, give up.



Date ( ) Data no longer available

( ) I would rather not respond 




I am directing a project funded by the National Institute of Education to 
integrate the previous research which has examined the effectiveness of various 
types of intervention for hyperactivity. We have obtained your article
________ which appeared in (was presented at) ________.

The article discusses some interesting research which we would like to include
in our integration effort. However, we did not find all of the information we
needed. Would you be willing to send additional information regarding the means
(X̄) and standard deviations (SD) for the following dependent variables reported
in your study?



Dependent Variables                    X̄          SD

1.
2.
3.

Enclosed is an addressed envelope which requires no postage. If it would be
easier for you, the information could be filled out directly on this form and
returned without any need of a separate cover letter. If the relevant data are
no longer available, or if you would rather not respond to this request, please
check the appropriate box in the upper right hand corner of this letter and
return it so that we will not bother you with follow-up requests.

Thank you for your consideration of this request. I look forward to
hearing from you in the near future.

Sincerely,



Karl R. White, Ph.D.
Director, Planning & Evaluation



Dear ________

I am directing a project funded by the National Institute of Education to 
integrate the previous research which has examined the effectiveness of various 
interventions for hyperactive children. In our search we came across a 

reference to your article ________
which was presented at ________.

I would like to obtain a copy of that article. If it has subsequently been
published in complete form, could you provide me with the reference? If not,
would it be possible for you to send me a copy? I would be happy to reimburse
you for the costs of reproduction and mailing.

If the article is not available, I would appreciate it if you would return
this letter with a note to that effect so that I will not trouble you with
follow-up requests. Thank you for your consideration of this request.

Sincerely, 



Karl R. White, Ph.D. 

Director, Planning & Evaluation



Date




Dear ________

Recently I wrote to you requesting some additional information about the
research you reported in an article entitled ________.
Attached is a copy of that letter in case the original went astray in the postal
system. If you have been too busy to respond, or the first letter arrived at an
inconvenient time, I understand, since I have frequently been in similar
situations myself.

The additional information which we requested is important to
the success of our project. Although you are only one of dozens of researchers
from whom we have requested information, every bit of information contributes

have now, with the remainder to follow if and when you can get it.
Thank you for your consideration of this request.

Sincerely,



Karl R. White, Ph.D. 

Director, Planning & Evaluation






